read text from a particular page using PDFBox [duplicate]

Submitted by 99封情书 on 2019-12-06 21:13:24

Question


I know how to read the text of an entire PDF file with PDFBox using PDFTextStripper.getText(PDDocument).

I also have a sample on how to get an object reference to a particular page using PDDocumentCatalog.getAllPages().get(i).

How do I get the text of just one page using PDFBox? I don't see any such method on the PDPage class.


Answer 1:


You can set parameters on the PDFTextStripper to read particular pages:

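PDDocument doc; // an already-loaded document
int i; // page number to extract (1-based: the first page is 1)

PDFTextStripper reader = new PDFTextStripper();
reader.setStartPage(i); // first page to include
reader.setEndPage(i);   // last page to include (inclusive), so only page i is read
String pageText = reader.getText(doc);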

As far as I'm aware, PDPage is used more for representing a page on screen than for extracting text, so I wouldn't recommend using it for text extraction.
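
One detail that is easy to trip over: setStartPage and setEndPage take 1-based page numbers, while a list index such as the i in getAllPages().get(i) from the question is 0-based. The sketch below is a small, hypothetical helper (PageNumbers and toStripperPage are names made up here, not part of PDFBox) that converts and range-checks such an index. It assumes the page count comes from PDDocument.getNumberOfPages().

```java
/** Hypothetical helper for illustration; not part of PDFBox. */
final class PageNumbers {
    private PageNumbers() {}

    /**
     * Converts a 0-based page index (as used with a page list) into the
     * 1-based page number that setStartPage/setEndPage expect.
     * Throws if the index does not refer to an existing page.
     */
    static int toStripperPage(int zeroBasedIndex, int pageCount) {
        if (zeroBasedIndex < 0 || zeroBasedIndex >= pageCount) {
            throw new IndexOutOfBoundsException(
                "page index " + zeroBasedIndex + " out of range for " + pageCount + " pages");
        }
        return zeroBasedIndex + 1;
    }
}

// Intended use with the stripper from the answer (assumes doc is a loaded PDDocument):
//   int page = PageNumbers.toStripperPage(i, doc.getNumberOfPages());
//   reader.setStartPage(page);
//   reader.setEndPage(page);
```

Failing fast on an out-of-range index is a design choice made here so that a wrong page number surfaces as an error instead of as unexpected or empty text.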



Source: https://stackoverflow.com/questions/13563482/read-text-from-a-particular-page-using-pdfbox
