Converting a PDF into multiple JPGs with iText or other

Submitted by 别来无恙 on 2020-01-14 06:56:28

Question


I need to convert any multipage PDF file into a set of JPGs.

Since the PDF files are supposed to come from a scanner, we can assume each page just contains a graphic object to extract, but I cannot be 100% sure of that.

So, I need to convert any renderable content from each page into a single JPEG file.

How can I do this with iText?

If I can't do this with iText, what Java library can achieve this?

Thanks.


Answer 1:


ICEpdf - http://www.icepdf.org/ - has an open source entry version which should do what you need.

I believe the primary difference between the open source version and the pay-for version is that the pay-for has much better font support.




Answer 2:


Ghostscript (available for Windows, Linux, MacOS X, Solaris, AIX,...) can convert...

  • ...from input formats: PDF, PostScript, EPS and AI
  • ...into output formats: JPEG, TIFF, PNG, PNM, PPM, BMP, (and more).

(The ImageMagick mentioned above doesn't do the conversion on its own -- it uses Ghostscript under the hood, as do many other tools.)
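Since the question asks for a Java solution, one way to use Ghostscript from Java is to launch it as an external process. The following is only a sketch under stated assumptions: the class name `GhostscriptJpegExport`, the output pattern `page-%03d.jpg`, and the idea that a Ghostscript binary is installed and reachable under the name you pass in are all hypothetical choices, not something the answer prescribes.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: shells out to a Ghostscript binary that is assumed to be installed. */
final class GhostscriptJpegExport {

    /** Builds the argument list for one PDF; Ghostscript expands "%03d" to the page number. */
    static List<String> buildCommand(String gsBinary, File pdf, File outDir, int dpi) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsBinary);          // e.g. "gs" on Unix-like systems (assumption: it is on the PATH)
        cmd.add("-dNOPAUSE");       // do not wait for a key press after each page
        cmd.add("-dBATCH");         // exit once the input has been processed
        cmd.add("-dSAFER");         // restrict file system access from within the document
        cmd.add("-sDEVICE=jpeg");   // select the JPEG output device
        cmd.add("-r" + dpi);        // rendering resolution in dots per inch
        cmd.add("-sOutputFile=" + new File(outDir, "page-%03d.jpg").getPath());
        cmd.add(pdf.getPath());     // the input file comes last
        return cmd;
    }

    /** Runs Ghostscript and returns its exit code (0 normally means success). */
    static int convert(String gsBinary, File pdf, File outDir, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsBinary, pdf, outDir, dpi))
                .inheritIO()        // forward Ghostscript's messages to this process's console
                .start();
        return process.waitFor();
    }
}
```

A call such as `GhostscriptJpegExport.convert("gs", new File("scan.pdf"), new File("out"), 150)` would then be expected to write one JPEG per page into `out`, provided that directory already exists.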




Answer 3:


You can also use Sun's PDF-Renderer, and JPedal does PDF-to-image conversion (low and high res).




Answer 4:


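With Apache PDFBox you could do the following (this uses the PDFBox 1.x API; PDPage.convertToImage was removed in 2.0):

PDDocument document = PDDocument.load(pdffile);
try {
  List<PDPage> pages = document.getDocumentCatalog().getAllPages();
  for (int i = 0; i < pages.size(); i++) {
    PDPage page = pages.get(i);
    // Render the whole page (not just embedded images) as RGB at 72 dpi
    BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 72);
    ImageIO.write(image, "jpg", new File(pdffile.getAbsolutePath() + "_" + i + ".jpg"));
  }
} finally {
  document.close(); // release the file handle and any scratch data
}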
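The snippet above passes each rendered page to `ImageIO.write`, which uses the default JPEG settings and names the files `<pdf path>_<i>.jpg`. If you want control over the compression quality, a possible variant that relies only on the JDK is sketched below. The class `JpegPageWriter`, its method names, and the `scan_001.jpg` naming scheme are hypothetical; the `BufferedImage` is assumed to come from whichever renderer you chose.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Locale;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

/** Hypothetical helper for saving rendered pages as JPEG files. */
final class JpegPageWriter {

    /** Maps a zero-based page index to a file next to the PDF, e.g. scan.pdf -> scan_001.jpg. */
    static File outputFile(File pdf, int pageIndex) {
        String base = pdf.getName().replaceFirst("(?i)\\.pdf$", "");
        String name = String.format(Locale.ROOT, "%s_%03d.jpg", base, pageIndex + 1);
        return new File(pdf.getAbsoluteFile().getParentFile(), name);
    }

    /** Writes the image as JPEG; quality ranges from 0.0 (smallest) to 1.0 (best). */
    static void write(BufferedImage page, File target, float quality) throws IOException {
        // JPEG has no alpha channel, so flatten the page onto a white RGB canvas first.
        BufferedImage rgb = new BufferedImage(
                page.getWidth(), page.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(page, 0, 0, null);
        } finally {
            g.dispose();
        }

        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);

        // Files.newOutputStream truncates an existing file, so no stale bytes remain.
        try (OutputStream file = Files.newOutputStream(target.toPath());
             ImageOutputStream out = ImageIO.createImageOutputStream(file)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(rgb, null, null), param);
        } finally {
            writer.dispose();
        }
    }
}
```

Inside the page loop you could then call `JpegPageWriter.write(image, JpegPageWriter.outputFile(pdffile, i), 0.9f)` in place of `ImageIO.write`. For scanned pages, rendering at a resolution higher than 72 dpi is likely to matter more for legibility than the quality setting.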


Source: https://stackoverflow.com/questions/6421967/converting-a-pdf-into-multiple-jpgs-with-itext-or-other
