How to read DOCX using Apache POI in page by page mode

Submitted by 前提是你 on 2019-12-11 06:05:17

Question


I would like to read a DOCX file and search it for a particular text. The program should print the document name and the page on which the text was found. I have written this simple method, but it doesn't count any page:

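    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;

    private static void searchDocx(File file, String searchText) throws IOException {
        // try-with-resources closes both the stream and the document
        try (FileInputStream fis = new FileInputStream(file);
             XWPFDocument document = new XWPFDocument(fis)) {
            String needle = searchText.toLowerCase();
            int pageNo = 1;
            for (XWPFParagraph paragraph : document.getParagraphs()) {
                // isPageBreak() reflects the "page break before" paragraph property,
                // so the counter is advanced before this paragraph is searched
                if (paragraph.isPageBreak()) {
                    pageNo++;
                }
                String text = paragraph.getText();
                if (text != null && text.toLowerCase().contains(needle)) {
                    System.out.println("found on page: " + pageNo + " in: " + file.getAbsolutePath());
                }
            }
        }
    }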

How can I read the file so that I can print the page on which searchText was found? Is there any way to know the current page when reading a DOCX file with Apache POI?
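
One possible direction, offered only as a rough sketch: a DOCX file stores no fixed pages, but a word processor may leave `w:lastRenderedPageBreak` hints in `word/document.xml` from the last time it laid the document out. The code below bypasses Apache POI and reads that XML part with the JDK alone, counting those hints to estimate the page on which each matching paragraph's text starts. The class `DocxPageSearch` and its methods are hypothetical names, and the part name `word/document.xml` is an assumption about a typical package layout. Explicit `w:br w:type="page"` breaks and "page break before" paragraph properties are deliberately not counted, since whether they need separate counting depends on how the producing application writes the hints. If the file contains no hints at all, every match is reported on page 1.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper (not part of Apache POI): estimates pages from the
// w:lastRenderedPageBreak hints stored in the main document part.
class DocxPageSearch {
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the estimated page of every paragraph whose text contains searchText.
    static List<Integer> findPages(InputStream documentXml, String searchText)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD is expected here
        XMLStreamReader reader = factory.createXMLStreamReader(documentXml);
        String needle = searchText.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int page = 1;          // current estimated page
        int textStartPage = 0; // page of the paragraph's first text, 0 = no text yet
        boolean inText = false;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT
                    && W_NS.equals(reader.getNamespaceURI())) {
                switch (reader.getLocalName()) {
                    case "p": // new paragraph: reset the collected text
                        text.setLength(0);
                        textStartPage = 0;
                        break;
                    case "lastRenderedPageBreak": // layout hint: a new page began here
                        page++;
                        break;
                    case "t": // text run content
                        inText = true;
                        if (textStartPage == 0) {
                            textStartPage = page;
                        }
                        break;
                    default:
                        break;
                }
            } else if (event == XMLStreamConstants.CHARACTERS && inText) {
                text.append(reader.getText());
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && W_NS.equals(reader.getNamespaceURI())) {
                String name = reader.getLocalName();
                if (name.equals("t")) {
                    inText = false;
                } else if (name.equals("p") && textStartPage > 0
                        && text.toString().toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(textStartPage);
                }
            }
        }
        reader.close();
        return hits;
    }

    // Opens the DOCX as a ZIP archive; "word/document.xml" is an assumed part name.
    static List<Integer> searchDocx(File file, String searchText)
            throws IOException, XMLStreamException {
        try (ZipFile zip = new ZipFile(file)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("No word/document.xml in " + file);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return findPages(in, searchText);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: DocxPageSearch <file.docx> <searchText>");
            return;
        }
        File file = new File(args[0]);
        for (int page : searchDocx(file, args[1])) {
            System.out.println("found on page: " + page + " in: " + file.getAbsolutePath());
        }
    }
}
```

The result is an estimate tied to the last layout that was saved into the file, so a different printer, font set, or application can paginate the same document differently. Paragraphs nested inside text boxes are not handled by this sketch.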

Source: https://stackoverflow.com/questions/44300740/how-to-read-docx-using-apache-poi-in-page-by-page-mode
