How to extract images from a PDF with iText in the correct order?

挽巷 2020-12-09 05:47

I am trying to extract images from a PDF file. I found an example on the web that worked fine:

    PdfReader reader;

    File file = new File("example.pdf


        
1 Answer
  • 2020-12-09 06:06

    I found an answer elsewhere, namely on the iText mailing list.

    The following code works for me. Please note that I switched from iText to PDFBox:

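    import java.util.List;
    import java.util.Map;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDResources;
    import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

    // PDFBox 1.x API; inFile is the source PDF (a java.io.File).
    PDDocument document = PDDocument.load(inFile);
    try {
        // getAllPages() returns an untyped List, hence the cast below.
        List<?> pages = document.getDocumentCatalog().getAllPages();
        for (Object o : pages) {
            PDPage page = (PDPage) o;
            PDResources resources = page.getResources();
            // getImages() maps each image's resource name to the image object.
            Map<String, PDXObjectImage> pageImages = resources.getImages();
            if (pageImages != null) {
                for (String key : pageImages.keySet()) {
                    PDXObjectImage image = pageImages.get(key);
                    image.write2OutputStream(/* some output stream */);
                }
            }
        }
    } finally {
        document.close(); // release the file handle
    }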
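
The `/* some output stream */` placeholder above still has to be filled in by the caller. One possible way to do that, sketched here with the standard library only, is to open one file per image and put the page number and the image's resource key in the file name. `ImageOutputs`, `fileName`, `open`, and the `page-NNNN-key.suffix` naming scheme are all hypothetical names made up for this illustration; none of them are part of PDFBox or iText.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox): opens one output file per image.
final class ImageOutputs {
    private ImageOutputs() {}

    // Builds a name such as "page-0003-Im1.jpg". The zero-padded page number
    // keeps the files grouped by page when they are listed alphabetically.
    static String fileName(int pageNumber, String imageKey, String suffix) {
        // Replace characters that are awkward in file names with '_'.
        String safeKey = imageKey.replaceAll("[^A-Za-z0-9_-]", "_");
        return String.format(Locale.ROOT, "page-%04d-%s.%s", pageNumber, safeKey, suffix);
    }

    // Creates the target directory if needed and opens the stream.
    // The caller is responsible for closing the returned stream.
    static OutputStream open(Path dir, int pageNumber, String imageKey, String suffix)
            throws IOException {
        Files.createDirectories(dir);
        return Files.newOutputStream(dir.resolve(fileName(pageNumber, imageKey, suffix)));
    }
}
```

Inside the inner loop this could be used as `try (OutputStream out = ImageOutputs.open(outDir, pageNumber, key, "bin")) { image.write2OutputStream(out); }`, where `outDir` and a `pageNumber` counter incremented once per page are assumed to be declared by the caller, and `"bin"` is only a placeholder extension. This names the files by page; it does not by itself establish the order of images within a page.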
    