algorithm - HSV color removal/dropout of form fields


I'm writing a system to drop out the field borders of a form image. The fields may have writing in them, and I need to keep that writing correctly when the handwriting crosses a field border.

I have 2 images: 1 color image (converted to the HSV colorspace) and 1 black/white image, which line up pixel for pixel (both are produced by the scanner).

I remove (pluck) the field border pixels from the black and white image, given the colors in the color image.

I have the advantage that I know a priori the exact location of each field, and the widths/heights of the field border lines.

My current implementation consists of (for each field) scanning the field border on the color image and calculating the average HSV value of the field border. Since I know where the field border is, I only visit "field border" pixels; I may visit a few handwriting pixels if they cross the field border, but the idea is that they won't skew the average much. Once I have the "average" HSV value of the field border, I scan the field border again, and for each pixel I compute the following delta function:

[image: delta function formula]

If the delta value between the "current" pixel and the average HSV is less than 0.07 (found empirically), I set the pixel to white (the colors are close together); otherwise I keep the pixel black.
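The average-then-threshold pass described above could be sketched roughly as follows. The delta formula from the post is only available as an image, so `hsv_delta` here is an assumed stand-in (Euclidean distance in the HSV cone, scaled to [0, 1]), not the poster's formula. The function names, the list-of-lists image layout, and HSV components scaled to [0, 1] are also assumptions made for illustration. Hue is averaged on the unit circle because it wraps around.

```python
import math

def mean_hsv(pixels):
    """Average (h, s, v) tuples with components in [0, 1].

    Hue is an angle, so it is averaged on the unit circle; a plain mean
    of hues 0.98 and 0.02 (both red) would land at 0.5 (cyan).
    """
    if not pixels:
        raise ValueError("need at least one border pixel")
    n = len(pixels)
    sum_cos = sum(math.cos(2 * math.pi * h) for h, _, _ in pixels)
    sum_sin = sum(math.sin(2 * math.pi * h) for h, _, _ in pixels)
    mean_h = (math.atan2(sum_sin, sum_cos) / (2 * math.pi)) % 1.0
    mean_s = sum(s for _, s, _ in pixels) / n
    mean_v = sum(v for _, _, v in pixels) / n
    return mean_h, mean_s, mean_v

def hsv_delta(a, b):
    """ASSUMED distance: Euclidean distance in the HSV cone, scaled to [0, 1].

    This is a stand-in, not the formula used in the post.
    """
    def cone(h, s, v):
        angle = 2 * math.pi * h
        return s * v * math.cos(angle), s * v * math.sin(angle), v
    ax, ay, az = cone(*a)
    bx, by, bz = cone(*b)
    dist = math.sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
    return dist / 2.0  # 2.0 is the cone's largest extent (base diameter)

def drop_out_border(bw, hsv, border_coords, mean, threshold=0.07):
    """Return a copy of bw with border-colored pixels set to white.

    bw: 2D list of 0 (black) / 255 (white); hsv: 2D list of (h, s, v).
    border_coords: (row, col) positions known a priori to lie on the border.
    """
    out = [row[:] for row in bw]
    for r, c in border_coords:
        if hsv_delta(hsv[r][c], mean) < threshold:
            out[r][c] = 255  # close to the border color: drop it out
    return out
```

The 0.07 threshold was tuned for the poster's own formula, so it would probably need re-tuning for any other distance, including this one.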

Here are examples of a field:

1. Color image: [image]
2. Black & white image, not dropped out: [image]
3. Dropped-out black & white image with saturation not used in the equation: [image]
4. Actual dropped-out black & white image with the formula used in full (all 3 components H, S & V): [image]

The formula I'm using for the 3rd (dropped-out) image above is the formula with saturation left out of the equation (I was playing around with things). This is not delicate enough for color variations, while the full formula is sensitive to saturation changes (these are caused by JPEG compression artifacts that exist within the image; example artifacts):

[image: example JPEG compression artifacts]

I think the 4th example is the best, because it's sensitive to color variations and so you're less likely to remove handwriting. The problem is that you're more prone to pick up the border because of slight color differences caused by simple scanning or compression artifacts.

What are your thoughts on how to alleviate some of the color (saturation) variations that occur within the field border? Should I use histograms, with some quantization involved there to reduce the number of bins?
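One way the histogram idea might look, as a rough sketch: quantize each border pixel into a coarse HSV bin, take the most populated bin as "the border color", and treat any pixel falling in that bin as border. The bin counts (16 hue, 4 saturation, 4 value) and the function names are arbitrary assumptions, and HSV components are assumed to be scaled to [0, 1].

```python
from collections import Counter

def quantize(pixel, h_bins=16, s_bins=4, v_bins=4):
    """Map an (h, s, v) pixel with components in [0, 1] to a coarse bin."""
    h, s, v = pixel
    return (min(int(h * h_bins), h_bins - 1),
            min(int(s * s_bins), s_bins - 1),
            min(int(v * v_bins), v_bins - 1))

def dominant_bin(pixels, **bins):
    """Return the most populated bin among the sampled border pixels."""
    counts = Counter(quantize(p, **bins) for p in pixels)
    return counts.most_common(1)[0][0]

def is_border_color(pixel, border_bin, **bins):
    """True if the pixel falls in the same coarse bin as the border color."""
    return quantize(pixel, **bins) == border_bin
```

Coarse saturation bins absorb small saturation shifts, but a border color that sits near a bin edge can split across two bins, so accepting neighboring bins as well may be worth trying.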

I'd like to hear any ideas people have.

Thank you.

You might get good results if you apply machine learning techniques to this problem.

For instance, if you want to label every pixel in the image as either field border or not field border, you could try hand labeling the pixels in a few images, computing a bunch of features (you are using color, but I think oriented gradients might give good results as well), and dumping them into a support vector machine (SVM).

OpenCV provides implementations of SVMs and gradient-based features (descriptors) if you are familiar with C++ or Python.

Alternatively, MATLAB provides code to train SVMs and compute gradient features as well.

