PDFBox: Convert a PDF to text or HTML, including the images from the PDF
Question:

I am developing a mobile application that converts PDF to HTML. I found PDFBox, which works very well: I can get the PDF's text or HTML on one side and its images on the other. But I want to go a little further: I need the generated HTML to contain the images from the PDF. Can this be done with PDFBox? How? If you know of another free library that can do this, please tell me. Thanks in advance.

Answer 1:

Take a look at ExtractImages.java; it will guide you on how to extract images from a PDF file. Next investigate