PDFBOX: Convert a pdf to text or html, including images from the pdf

Submitted by a 夏天 on 2019-12-12 02:34:21

Question


I am developing a mobile application that converts PDF to HTML. I found PDFBox, which works very well: I can get the PDF's text or HTML on one side and its images on the other. But I want to go a step further: I need the generated HTML to contain the images from the PDF. Can this be done with PDFBox, and if so, how? If you know of another free library that can do this, please tell me.

Thanks in advance.


Answer 1:


Take a look at ExtractImages.java; it shows how to extract images from a PDF file.

Next, investigate the PrintImageLocations.java example; you will need those locations to position the images correctly in the HTML file.
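
Once you have an image and its location, the remaining step is to write it into the HTML. Here is a minimal sketch of that step only, using nothing but the Java SE standard library. It assumes you already hold each image as a BufferedImage (PDFBox can hand you one) and that you have its position and displayed size in PDF points, as PrintImageLocations reports them. The class name HtmlImageEmbedder and its methods are made up for illustration and are not part of PDFBox. The sketch also assumes an unrotated page whose origin is at the lower-left corner, so the y coordinate has to be flipped for HTML.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import java.util.Locale;
import javax.imageio.ImageIO;

// Hypothetical helper (not a PDFBox class): turns an extracted image plus
// its PDF location into an absolutely positioned <img> tag.
final class HtmlImageEmbedder {

    // Encodes the image as PNG and wraps it in a base64 data URI, so the
    // HTML file needs no separate image files.
    static String toDataUri(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return "data:image/png;base64,"
                + Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // pdfX/pdfY: lower-left corner of the image in PDF points.
    // width/height: displayed size in PDF points.
    // pageHeight: page height in PDF points (assumption: no rotation and
    // no crop-box offset).
    static String toImgTag(BufferedImage image, float pdfX, float pdfY,
                           float width, float height, float pageHeight)
            throws IOException {
        // PDF y grows upward from the bottom; CSS top grows downward.
        float top = pageHeight - pdfY - height;
        return String.format(Locale.ROOT,
                "<img src=\"%s\" style=\"position:absolute;"
                        + "left:%.1fpt;top:%.1fpt;width:%.1fpt;height:%.1fpt\">",
                toDataUri(image), pdfX, top, width, height);
    }
}
```

Each tag would go inside a page container styled with position:relative and the page's width and height, alongside the text HTML you already generate. Note that java.awt and javax.imageio are not available on Android, so on a mobile runtime the encoding step would need a platform-specific replacement.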



Source: https://stackoverflow.com/questions/9671239/pdfbox-convert-a-pdf-to-text-or-html-including-images-from-the-pdf
