Get a thumbnail of a Word document in Java using Apache POI

Posted by 試著忘記壹切 on 2019-12-13 06:00:56

Question


I am working on a web sharing project in JSF. In this project, users can upload documents such as .doc, .pdf, .ppt, etc. I want to show the first page of each document as a thumbnail. After some googling I found Apache POI. Does anybody have a suggestion for my problem? How can I get a thumbnail image of a Word document's first page? I tried the code below, but it only gets the first picture that the Word document contains:

        POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("d:\\test.doc"));
        HWPFDocument doc = new HWPFDocument(fs);
        PicturesTable pt=doc.getPicturesTable();
        List<Picture> p=pt.getAllPictures();
        BufferedImage image=ImageIO.read(new ByteArrayInputStream(p.get(0).getContent()));
        ImageIO.write(image, "JPG", new File("d:\\test.jpg"));

Answer 1:


What you are doing will not produce a page thumbnail. HWPFDocument can extract the thumbnail embedded in the document (when saving the file, check the 'add preview' option), so it can only extract a thumbnail from documents that actually contain one.

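Even then, you need something like this:

    import java.io.File;
    import org.apache.poi.hpsf.SummaryInformation;
    import org.apache.poi.hpsf.Thumbnail;
    import org.apache.poi.hwpf.HWPFDocumentCore;
    import org.apache.poi.hwpf.converter.AbstractWordUtils;

    static byte[] process(File docFile) throws Exception {
        final HWPFDocumentCore wordDocument = AbstractWordUtils.loadDoc(docFile);
        // The embedded preview, if any, is stored in the document's summary information.
        SummaryInformation summaryInformation = wordDocument.getSummaryInformation();
        System.out.println(summaryInformation.getAuthor());
        System.out.println(summaryInformation.getApplicationName() + ":" + summaryInformation.getTitle());
        // getThumbnail() may be null if the document was saved without a preview.
        Thumbnail thumbnail = new Thumbnail(summaryInformation.getThumbnail());
        System.out.println(thumbnail.getClipboardFormat());
        System.out.println(thumbnail.getClipboardFormatTag());
        // Returns the raw WMF data; throws if the thumbnail is stored in another format.
        return thumbnail.getThumbnailAsWMF();
    }

After that, you will probably have to convert the WMF data to a more common format (JPEG, PNG, ...). ImageMagick can help.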
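
Once the preview (or, as in the question, the first embedded picture) has been decoded into a BufferedImage, shrinking it to thumbnail size and saving it as JPEG needs only the standard library. The following is a minimal sketch under stated assumptions: the class name ThumbnailSketch, the helper names scaleToFit and toJpeg, and the 200x200 bound are all illustrative, and the sketch does not decode WMF itself (ImageIO ships no WMF reader, so that step still has to be done elsewhere, for example with ImageMagick).

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class, for illustration only.
class ThumbnailSketch {

    // Scales src to fit inside maxWidth x maxHeight, keeping the aspect ratio.
    // Images that are already small enough are not enlarged.
    static BufferedImage scaleToFit(BufferedImage src, int maxWidth, int maxHeight) {
        double scale = Math.min(1.0, Math.min(
                (double) maxWidth / src.getWidth(),
                (double) maxHeight / src.getHeight()));
        int w = Math.max(1, (int) Math.round(src.getWidth() * scale));
        int h = Math.max(1, (int) Math.round(src.getHeight() * scale));

        // Draw onto an opaque RGB image with a white background; this avoids
        // the problems ImageIO's JPEG writer has with alpha channels.
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w, h);
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    // Encodes the image as JPEG bytes; fails loudly if no writer accepts it.
    static byte[] toJpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "jpg", out)) {
            throw new IOException("No JPEG writer available for this image type");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a decoded page preview or embedded picture.
        BufferedImage page = new BufferedImage(800, 600, BufferedImage.TYPE_INT_ARGB);
        BufferedImage thumb = scaleToFit(page, 200, 200);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()); // prints "200x150"
        System.out.println(toJpeg(thumb).length + " bytes");
    }
}
```

In the question's code, the BufferedImage returned by ImageIO.read could be passed through scaleToFit before ImageIO.write, so that a large embedded picture is not written out at full size.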



Source: https://stackoverflow.com/questions/13673337/get-thumbnail-of-word-in-java-using-apache-poi
