Algorithm: HSV color removal/dropout of form fields
I'm writing a system to drop out field borders from a form image. The fields may have writing in them, which I need to keep correctly if the handwriting crosses the field border.
I have 2 images: 1 color image (converted to the HSV colorspace) and 1 black/white image, which line up pixel for pixel (both are produced by the scanner).
I remove (pluck) the field border pixels from the black and white image, given the colors in the color image.
I have the advantage that I know a priori the exact location of each field, and the widths/heights of the field border lines.
My current implementation consists of (for each field) scanning the field border on the color image and calculating the average HSV value of the field border. Since I know where the field border is, I visit only "field border" pixels; I may visit a few handwriting pixels if they cross the field border, but the idea is that they won't skew the average much. Once I have the "average" HSV value of the field border, I scan the field border again, and for each pixel compute the following delta function:
[image: delta formula]
If the delta value between the "current" pixel and the average HSV is less than 0.07 (found empirically), I set the pixel to white (the colors are close together); otherwise I keep the pixel black.
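The two-pass scheme described above can be sketched roughly as follows. The delta formula itself is not reproduced in this post, so `hsv_delta` below is an assumed stand-in (a weighted Euclidean distance with hue treated as a circle), and the dict-based image layout, the function names, and the `weights` parameter are all hypothetical. Setting the saturation weight to 0 mimics the "saturation left out" variant.

```python
import math

def circular_hue_diff(h1, h2):
    """Smallest distance between two hues in [0, 1), treating hue as a circle."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)

def mean_hsv(pixels):
    """Average HSV; hue is averaged on the circle so 0.98 and 0.02 average near 0."""
    n = len(pixels)
    sx = sum(math.cos(2 * math.pi * h) for h, _, _ in pixels)
    sy = sum(math.sin(2 * math.pi * h) for h, _, _ in pixels)
    mean_h = (math.atan2(sy, sx) / (2 * math.pi)) % 1.0
    mean_s = sum(s for _, s, _ in pixels) / n
    mean_v = sum(v for _, _, v in pixels) / n
    return (mean_h, mean_s, mean_v)

def hsv_delta(p, q, weights=(1.0, 1.0, 1.0)):
    """ASSUMED delta (not the original formula): weighted Euclidean distance."""
    dh = circular_hue_diff(p[0], q[0])
    ds = p[1] - q[1]
    dv = p[2] - q[2]
    wh, ws, wv = weights
    return math.sqrt(wh * dh * dh + ws * ds * ds + wv * dv * dv)

def drop_border(hsv, bw, border_coords, threshold=0.07, weights=(1.0, 1.0, 1.0)):
    """hsv: dict (x, y) -> (h, s, v) with components in [0, 1].
    bw: dict (x, y) -> 1 for black, 0 for white.
    Returns a copy of bw with border-colored pixels set to white."""
    # Pass 1: average color over the known border pixels.
    avg = mean_hsv([hsv[c] for c in border_coords])
    out = dict(bw)
    # Pass 2: whiten pixels whose color is close to the average.
    for c in border_coords:
        if hsv_delta(hsv[c], avg, weights) < threshold:
            out[c] = 0
    return out
```

Calling `drop_border(hsv, bw, coords, weights=(1.0, 0.0, 1.0))` would ignore saturation entirely; intermediate weights are one way to make the test less sensitive to saturation noise without discarding it.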
Here are some examples of a field:
1. [image] Color image
2. [image] Black & white image, not dropped out
3. [image] Dropped-out black & white image, saturation not used in the equation
4. [image] Actual dropped-out black & white image, formula used in full (using all 3 components H, S & V)
The formula I used for the 3rd image above is the same formula with saturation left out of the equation (I was playing around with things).
Leaving saturation out is not delicate enough to pick up color variations, while the full formula is too sensitive to saturation changes. Those changes are caused by JPEG compression artifacts that exist within the image:
[image: example of the artifacts]
I think the 4th example is the best because it's sensitive to color variations, so you're less likely to remove handwriting. The problem is that you're more prone to pick up (keep) parts of the border because of slight color differences caused by simple scanning or compression artifacts.
What are your thoughts on how to alleviate some of the color (saturation) variations that occur within the field border? Should I use histograms? Would some quantization be involved there to reduce the number of bins?
I'd love to hear any ideas people have.
Thank you.
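One way the histogram idea could look, as a rough sketch only: quantize each HSV value into a few coarse bins, count the border pixels per bin, and treat the most populated bins as "border color". The bin counts `(12, 4, 4)`, the `coverage` fraction, and all function names are assumptions made for illustration, not tuned values.

```python
from collections import Counter

def quantize(hsv, bins=(12, 4, 4)):
    """Map (h, s, v) in [0, 1] to a coarse bin index per component."""
    return tuple(min(int(c * b), b - 1) for c, b in zip(hsv, bins))

def dominant_bins(border_pixels, bins=(12, 4, 4), coverage=0.9):
    """Most populated bins that together hold `coverage` of the border pixels."""
    counts = Counter(quantize(p, bins) for p in border_pixels)
    total = sum(counts.values())
    chosen, covered = set(), 0
    for bin_index, n in counts.most_common():
        if covered >= coverage * total:
            break
        chosen.add(bin_index)
        covered += n
    return chosen

def is_border_color(pixel, chosen, bins=(12, 4, 4)):
    """True if the pixel falls into one of the dominant border bins."""
    return quantize(pixel, bins) in chosen
```

Coarse saturation bins absorb small saturation shifts, and keeping more than one bin lets a border that straddles a bin edge (or the hue wrap-around at 0) still be recognized. The trade-off is that a field with a lot of handwriting on its border could push an ink bin into the dominant set, so `coverage` would need care.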
You might get good results if you apply machine learning techniques to the problem.
For instance, if you want to label every pixel in the image as either field border or not field border, you could try hand-labeling the pixels in a few images, computing a bunch of features (you are using color, but I think oriented gradients might give good results as well), and dumping them into a support vector machine (SVM).
OpenCV provides implementations of SVMs and gradient-based features (descriptors) if you are familiar with C++ or Python.
Alternatively, MATLAB provides code to train SVMs and compute gradient features as well.
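To make the labeling-and-training idea concrete, here is a minimal dependency-free sketch. It does not use OpenCV or MATLAB: instead of a library SVM it trains a small linear SVM by subgradient descent on the hinge loss, and it uses only color features (gradient features would be appended to the same vector). The feature layout, function names, and hyperparameters are all assumptions for illustration.

```python
import math
import random

def pixel_features(hsv):
    """Color-only features; hue is encoded as cos/sin so it wraps correctly."""
    h, s, v = hsv
    angle = 2 * math.pi * h
    return [math.cos(angle), math.sin(angle), s, v]

def train_linear_svm(samples, labels, epochs=100, lr=0.1, lam=0.01, seed=0):
    """Stochastic subgradient descent on the L2-regularized hinge loss.
    samples: list of feature lists; labels: +1 (border) or -1 (not border)."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Regularization shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Sample is inside the margin: push the boundary away from it.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 for 'field border', -1 for 'not field border'."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

In use, the hand-labeled pixels would be converted with `pixel_features`, passed to `train_linear_svm`, and every border-region pixel of a new form would then be classified with `predict`. A library SVM with a non-linear kernel would likely replace `train_linear_svm` in practice.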