Extract images from PDF using PDFBox

刺人心 2020-11-28 09:22

I'm trying to extract images from a PDF using PDFBox. The example PDF is here.

But I'm getting only blank images.

The code I'm trying:

public st         


        
8 answers
  •  天涯浪人
    2020-11-28 09:48

    For PDFBox 2.0.1, pudaykiran's answer must be modified slightly, because some APIs have changed.

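    import java.io.File;
    import javax.imageio.ImageIO;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDResources;
    import org.apache.pdfbox.pdmodel.graphics.PDXObject;
    import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

    public static void testPDFBoxExtractImages() throws Exception {
        // try-with-resources closes the document even if extraction fails
        try (PDDocument document = PDDocument.load(new File("D:/Temp/Test.pdf"))) {
            for (PDPage page : document.getPages()) {
                PDResources pdResources = page.getResources();
                if (pdResources == null) continue; // page has no resources
                for (COSName c : pdResources.getXObjectNames()) {
                    PDXObject o = pdResources.getXObject(c);
                    if (o instanceof PDImageXObject) {
                        // Write each top-level image XObject to its own PNG file
                        File file = new File("D:/Temp/" + System.nanoTime() + ".png");
                        ImageIO.write(((PDImageXObject) o).getImage(), "png", file);
                    }
                }
            }
        }
    }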
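
    Since the question is about blank output, it may help to check each decoded image before writing it and to confirm that the write actually succeeded. The following is only a sketch using the standard library: the class name ImageSaver, its methods isBlank and save, and the image-0001.png naming pattern are hypothetical and not part of PDFBox. It does not explain why a given PDF yields blank images; it only makes the symptom easier to detect.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of PDFBox): saves images under sequential
// names and offers a simple uniform-color check for "blank" images.
final class ImageSaver {
    private final File outputDir;
    private int counter = 0;

    ImageSaver(File outputDir) {
        this.outputDir = outputDir;
    }

    // Returns true if every pixel has the same ARGB value.
    static boolean isBlank(BufferedImage image) {
        int first = image.getRGB(0, 0);
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (image.getRGB(x, y) != first) {
                    return false;
                }
            }
        }
        return true;
    }

    // Writes the image as image-0001.png, image-0002.png, ... and returns the file.
    File save(BufferedImage image) throws IOException {
        counter++;
        File file = new File(outputDir, String.format(Locale.ROOT, "image-%04d.png", counter));
        // ImageIO.write returns false when no suitable writer accepts the image,
        // so the result is checked instead of being silently ignored.
        if (!ImageIO.write(image, "png", file)) {
            throw new IOException("No PNG writer accepted " + file.getName());
        }
        return file;
    }
}
```

    In the extraction loop, the ImageIO.write call could then be replaced by something like saver.save(((PDImageXObject) o).getImage()), with ImageSaver.isBlank used to log or skip uniform images. Sequential names also avoid relying on System.nanoTime() for uniqueness.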
    
