Apache PDFBox convert pdf to images

Anonymous (unverified), submitted on 2019-12-03 01:20:02

Question:

Can someone give me an example of how to use Apache PDFBox to convert a PDF into separate images (one for each page of the PDF)? Thanks in advance.

Answer 1:

Solution for 1.8.* versions:

    PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
    // getAllPages() returns a raw List in 1.8, hence the unchecked conversion
    @SuppressWarnings("unchecked")
    List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
    int page = 0;
    for (PDPage pdPage : pdPages) {
        ++page;
        // render the page as RGB at 300 DPI
        BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
        ImageIOUtil.writeImage(bim, pdfFilename + "-" + page + ".png", 300);
    }
    document.close();

Don't forget to read the 1.8 dependencies page before doing your build.

Solution for the 2.0 version:

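    PDDocument document = PDDocument.load(new File(pdfFilename));
    PDFRenderer pdfRenderer = new PDFRenderer(document);
    for (int page = 0; page < document.getNumberOfPages(); ++page) {
        // render each page as RGB at 300 DPI, mirroring the 1.8 example above
        BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
        // the filename suffix determines the image format
        ImageIOUtil.writeImage(bim, pdfFilename + "-" + (page + 1) + ".png", 300);
    }
    document.close();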

The ImageIOUtil class is in a separate download / artifact (pdfbox-tools). Read the 2.0 dependencies page before doing your build: you'll need extra jar files for PDFs with JBIG2 images, for saving to TIFF images, and for reading encrypted files.
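Whether an extra jar is needed for a given output format can be checked at runtime. The following is a minimal sketch, assuming the images are written through Java's standard ImageIO plugins; the class name `ImageFormatCheck` and the method `canWrite` are hypothetical names used only for illustration.

```java
import javax.imageio.ImageIO;

/** Hypothetical helper: reports whether this JVM can write a given image format. */
final class ImageFormatCheck {
    /** Returns true if an ImageIO writer is registered for the file suffix, e.g. "png". */
    static boolean canWrite(String suffix) {
        return ImageIO.getImageWritersBySuffix(suffix).hasNext();
    }

    public static void main(String[] args) {
        // Print the availability of a few common output formats.
        for (String suffix : new String[] {"png", "jpg", "tiff"}) {
            System.out.println(suffix + " writable: " + canWrite(suffix));
        }
    }
}
```

If `canWrite("tiff")` reports false on your JVM, that suggests a TIFF plugin jar from the dependencies page is still missing from the classpath.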

If you are using JDK 8, start the JVM with -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider; otherwise rendering will be very slow.
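When the JVM launch flags cannot be changed, the same property can be set from code instead. This is a minimal sketch, assuming it runs before any Java 2D color-management class has been loaded (for example, as the first statement in `main`); the class name `KcmsSetup` and the method `useKcms` are hypothetical.

```java
/** Hypothetical helper: selects the KCMS color management module on JDK 8. */
final class KcmsSetup {
    static final String CMM_PROPERTY = "sun.java2d.cmm";
    static final String KCMS_PROVIDER = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    /** Sets the property unless it is already set; returns the value now in effect. */
    static String useKcms() {
        if (System.getProperty(CMM_PROPERTY) == null) {
            System.setProperty(CMM_PROPERTY, KCMS_PROVIDER);
        }
        return System.getProperty(CMM_PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println(CMM_PROPERTY + "=" + useKcms());
    }
}
```

The helper leaves a value passed with -D on the command line untouched, so the launch flag still takes precedence when it is present.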



Answer 2:

Without any extra dependencies, you can use the PDFToImage class already included in PDFBox.

Kotlin:

PDFToImage.main(arrayOf("-outputPrefix", "newImgFilenamePrefix", existingPdfFilename))

Other configuration options: https://pdfbox.apache.org/docs/2.0.8/javadocs/org/apache/pdfbox/tools/PDFToImage.html
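The same call can be made from Java with a few more options. This is a minimal sketch that only builds the argument array, assuming the `-format`, `-dpi` and `-outputPrefix` options of the 2.0 command-line tool; the class `PdfToImageArgs`, its `build` method, and the sample file names are hypothetical.

```java
/** Hypothetical helper: builds a command line for PDFBox's PDFToImage tool. */
final class PdfToImageArgs {
    /** Returns arguments selecting the output format, resolution and filename prefix. */
    static String[] build(String pdfFile, String outputPrefix, String format, int dpi) {
        return new String[] {
            "-format", format,
            "-dpi", Integer.toString(dpi),
            "-outputPrefix", outputPrefix,
            pdfFile  // path of the input PDF
        };
    }

    public static void main(String[] args) {
        String[] toolArgs = build("input.pdf", "page-", "png", 300);
        System.out.println(String.join(" ", toolArgs));
    }
}
```

With the PDFBox tools classes on the classpath, the resulting array can be passed directly to `PDFToImage.main(toolArgs)`.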


