Apache POI: Retrieve page number from XWPFParagraph instance?

冷暖自知 提交于 2020-01-06 19:29:34

问题


I am iterating over XWPFParagraph instances coming from an XWPTDocument instance (using the "getParagraphs()" method) Is there a way to retrieve the page numbers where each paragraph is located from the XWPFParagraph instances?


回答1:


To eventually turn Gagravarr's comment into a proper answer: No, this is not possible.

Doing so would require a full-blown Word rendering engine (i.e. MS Word itself) and even then you cannot be absolutely sure that page breaks will always occur at exactly those positions where they happened to be when the file was once created (think: missing fonts, missing pictures, different display options for vanished text and/or revision marking, different printer margins, etc.).

So claiming that some content in a Word file is on a certain line X on a certain page Y actually expresses a fundamental misunderstanding of the Word file format. There is simply no notion of line and page in there. It's all about runs resp. ranges.

In other words: Only upon opening such a file with MS Word will those contents be rendered onto individual lines / pages. And the behavior of this renderer unpredictable to a certain extent.



来源:https://stackoverflow.com/questions/30478540/apache-poi-retrieve-page-number-from-xwpfparagraph-instance

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!