Question
I would like to find out how to extract data from this PDF (example image): http://postimg.org/image/ypebht5dx/
For example, I want to extract only the values in the column "TENSIONE[V]", and when a cell is blank I want to put the letter "X" in the output. How can I do this?
The code I used is this:
PDDocument p = PDDocument.load(new File("a.pdf"));
PDFTextStripper t = new PDFTextStripper();
System.out.println(t.getText(p));
and I get this output:
http://s23.postimg.org/wbhcrw03v/Immagine.png
Answer 1:
These are only guidelines; adapt them to your case. The code is untested, but it should help you solve your issue. If you have any questions, let me know.
String text = t.getText(p);
String[] lines = text.split("\\r?\\n"); // all the lines, split on line breaks
String[] cols = lines[0].split("\\s+"); // fields of the first line, split on whitespace
// cols[0] contains pins
// cols[1] contains TENSIONE[V]
// cols[2] contains TOLLRENZA; if that cell is blank the array is shorter, so check cols.length first
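
To cover the "X for a blank cell" part of the question, the same splitting idea can be wrapped in a small helper. This is only a sketch under assumptions: `ColumnExtractor` and `extractColumn` are hypothetical names (not part of PDFBox), the sample text is made up, and a cell is treated as blank when its line has too few whitespace-separated tokens to reach the requested column. That works when the blank cell is the last one on its line. If a blank cell sits in the middle of a row, later values shift left and plain whitespace splitting cannot detect it; in that case a position-based approach such as PDFBox's PDFTextStripperByArea is probably needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: it works on the text
// returned by PDFTextStripper.getText().
class ColumnExtractor {

    // Returns the token at columnIndex for every non-empty line, or "X" when
    // the line has too few whitespace-separated tokens to contain that column.
    static List<String> extractColumn(String text, int columnIndex) {
        List<String> values = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty lines between rows
            }
            String[] cols = trimmed.split("\\s+");
            values.add(cols.length > columnIndex ? cols[columnIndex] : "X");
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for t.getText(p); real output may differ.
        String text = "PIN TENSIONE[V]\n1 5.0\n2\n3 3.3\n";
        System.out.println(extractColumn(text, 1)); // prints [TENSIONE[V], 5.0, X, 3.3]
    }
}
```

The header line is returned like any other row, so drop the first element if only the values are wanted.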
Source: https://stackoverflow.com/questions/16217999/java-pdfbox-extract-data-from-a-column-of-a-table