Java PDFBox: extract data from a column of a table
I am trying to find out how to extract data from a PDF (example image: http://postimg.org/image/ypebht5dx/).
For example, I want to extract the values in the column "tensione[v]", and if a blank cell is encountered, put the letter "x" in the output. How can I do this?
This is the code I used:
PDDocument p = PDDocument.load(new File("a.pdf"));
PDFTextStripper t = new PDFTextStripper();
System.out.println(t.getText(p));
And the output:
These are only guidelines; adapt them to your own use case. The code is not tested, but it should solve the issue. If you have any questions, let me know.
String text = t.getText(p);
String[] lines = text.split("\\r?\\n"); // gives the lines, split on line breaks
String[] cols = lines[0].split("\\s+"); // gives the columns of the first line, split on whitespace
// cols[0] contains pins
// cols[1] contains tensione[v]
// cols[2] contains tollrenza, if present; otherwise there is no third element
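To get the "x" for blank cells, the same split can be applied to every line rather than only the first. Here is a minimal sketch of that idea. It is not tested against your PDF, and the class name ColumnExtractor, the method extractColumn, and the sample text are made up for illustration. It assumes a row with a blank cell simply comes out of getText() with fewer whitespace-separated tokens. That only holds when the blank cell is the last filled one in its row: if tensione[v] is blank but the third column has a value, that value slides into index 1 and whitespace splitting cannot tell the difference. In that case you would probably need position-based extraction (PDFTextStripperByArea may be worth a look).

```java
import java.util.ArrayList;
import java.util.List;

class ColumnExtractor {

    // Returns the token at columnIndex for every non-empty line of the text.
    // A row with too few tokens is assumed to have a blank cell at that
    // position, so the placeholder is emitted instead.
    static List<String> extractColumn(String text, int columnIndex, String placeholder) {
        List<String> values = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty lines between rows
            }
            String[] cols = trimmed.split("\\s+");
            values.add(columnIndex < cols.length ? cols[columnIndex] : placeholder);
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the string returned by t.getText(p)
        String text = "1 12.5 0.1\n2\n3 5.0\n";
        System.out.println(extractColumn(text, 1, "x")); // prints [12.5, x, 5.0]
    }
}
```

With your document you would call extractColumn(t.getText(p), 1, "x"). If the header row is part of the extracted text, the first element of the result will be the header itself ("tensione[v]"), so drop it before using the values.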