Extracting MS Word Table Cell as image?

时光总嘲笑我的痴心妄想 submitted on 2019-12-08 22:01:09

Question


I need to extract table cells as images. The cells may contain mixed content (text plus an image), which I need to merge into a single image. I can already get the cell text, but I don't know how to get the image together with the text. I'm not sure whether Apache POI can help.

Has anyone done something like this before?

public static void readTablesDataInDocx(XWPFDocument doc) {
    List<XWPFTable> tables = doc.getTables();
    System.out.println("==========No Of Tables in Document=============================================" + tables.size());
    int tableIdx = 1;
    for (XWPFTable xwpfTable : tables) {
        System.out.println("================table -" + tableIdx + "===Data==");
        int rowIdx = 1;
        for (XWPFTableRow xwpfTableRow : xwpfTable.getRows()) {
            System.out.println("Row -" + rowIdx);
            int colIdx = 1;
            for (XWPFTableCell xwpfTableCell : xwpfTableRow.getTableCells()) {
                if (xwpfTableCell != null) {
                    // getText() returns only the cell's text, not its pictures
                    System.out.print("\t" + colIdx + "- column value: " + xwpfTableCell.getText());
                }
                colIdx++;
            }
            System.out.println("");
            rowIdx++;
        }
        tableIdx++;
        System.out.println("");
    }
}

I can get the text with this call:

System.out.print("\t" + colIdx + "- column value: " + xwpfTableCell.getText());

How do I get the Image if a cell also contains one?


Answer 1:


Try this code; it works for me:

XWPFDocument doc = new XWPFDocument(new FileInputStream(fileName));
for (XWPFTable xwpfTable : doc.getTables()) {
    for (XWPFTableRow xwpfTableRow : xwpfTable.getRows()) {
        for (XWPFTableCell xwpfTableCell : xwpfTableRow.getTableCells()) {
            if (xwpfTableCell != null) {
                System.out.println(xwpfTableCell.getText());
                // Pictures live inside the runs of the cell's paragraphs
                for (XWPFParagraph p : xwpfTableCell.getParagraphs()) {
                    for (XWPFRun run : p.getRuns()) {
                        for (XWPFPicture pic : run.getEmbeddedPictures()) {
                            // Raw bytes of the embedded image file
                            byte[] pictureData = pic.getPictureData().getData();
                            // Print the size; concatenating the array itself only prints its reference
                            System.out.println("picture : " + pictureData.length + " bytes");
                        }
                    }
                }
            }
        }
    }
}
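
The byte array returned by getPictureData().getData() above is the embedded image file as stored in the document, so it can be decoded with the standard javax.imageio API. The following is a minimal sketch, not part of Apache POI: the class name PictureBytes and its two methods are hypothetical helpers introduced here for illustration. It assumes the picture is in a format ImageIO can read (such as PNG, JPEG, GIF or BMP); for other formats ImageIO.read returns null.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper (not a POI class) for working with raw picture bytes. */
class PictureBytes {

    /**
     * Decodes raw image file bytes into a BufferedImage.
     * Returns null when no registered ImageIO reader recognises the format.
     */
    static BufferedImage decode(byte[] data) {
        try {
            return ImageIO.read(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Encodes an image as PNG and returns the resulting file bytes. */
    static byte[] toPng(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(image, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Inside the innermost loop, a call such as PictureBytes.decode(pictureData) would then give an image object that can be drawn or saved, provided the result is checked for null first.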



Answer 2:


When you have a Cell, you can get hold of the paragraphs that form that Cell. These paragraphs are in turn formed by Runs, which you can obtain by calling the getRuns method. Runs themselves can contain embedded images, which you can obtain by calling the getEmbeddedPictures method.

You can therefore have a method that gets the embedded pictures of a cell:

public static void printDescriptionOfImagesInCell(XWPFTableCell cell) {
    List<XWPFParagraph> paragraphs = cell.getParagraphs();
    for (XWPFParagraph paragraph : paragraphs) {
        List<XWPFRun> runs = paragraph.getRuns();
        for (XWPFRun run : runs) {
            List<XWPFPicture> pictures = run.getEmbeddedPictures();
            for (XWPFPicture picture : pictures) {
                //Do anything you want with the picture:
                System.out.println("Picture: " + picture.getDescription());
            }
        }
    }
}

The XWPFPicture documentation describes what else is available for each picture. For example, getPictureData() gives access to the image bytes and file name, so you can change the method to work with the actual image data instead of just printing the description.
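
Once the cell text and its decoded pictures are in hand, merging them into one image, as the question asks, can be done with plain Java2D. The sketch below is only one possible approach under stated assumptions: CellComposer, compose and PADDING are hypothetical names, the text is drawn line by line in the default font above the pictures, and the pictures are stacked vertically. It does not reproduce Word's actual cell layout, fonts or wrapping.

```java
import java.awt.Color;
import java.awt.FontMetrics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

/** Hypothetical helper (not a POI class): text on top, pictures stacked below. */
class CellComposer {
    static final int PADDING = 4; // assumed margin in pixels

    static BufferedImage compose(String text, List<BufferedImage> pictures) {
        String[] lines = (text == null || text.isEmpty()) ? new String[0] : text.split("\n");

        // Font metrics are only needed when there is text to measure.
        FontMetrics fm = null;
        if (lines.length > 0) {
            Graphics2D scratch = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB).createGraphics();
            fm = scratch.getFontMetrics();
            scratch.dispose();
        }

        // First pass: compute the canvas size.
        int width = 0;
        int height = PADDING;
        for (String line : lines) {
            width = Math.max(width, fm.stringWidth(line));
            height += fm.getHeight();
        }
        for (BufferedImage picture : pictures) {
            width = Math.max(width, picture.getWidth());
            height += picture.getHeight() + PADDING;
        }
        width += 2 * PADDING;

        // Second pass: paint a white background, then the text, then the pictures.
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);
        int y = PADDING;
        for (String line : lines) {
            g.setFont(fm.getFont());
            g.drawString(line, PADDING, y + fm.getAscent());
            y += fm.getHeight();
        }
        for (BufferedImage picture : pictures) {
            g.drawImage(picture, PADDING, y, null);
            y += picture.getHeight() + PADDING;
        }
        g.dispose();
        return out;
    }
}
```

The result could then be written out with ImageIO.write(image, "png", file). A faithful rendering of the cell as Word displays it would need a real layout engine, which this sketch does not attempt.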



Source: https://stackoverflow.com/questions/37961845/extracting-ms-word-table-cell-as-image
