How can I convert PDF to HTML?

慢半拍i 2020-12-12 20:09

What good libraries are there, in any common language, for converting PDF to HTML?

9 answers
  • 2020-12-12 21:08

    On Linux, install pdftohtml. To batch-convert all PDF files in a folder, use:

    for f in *.pdf; do pdftohtml "$f"; done
    

    This creates an HTML site with all the references and images from the original documents, with every page in a separate HTML file. It is very useful for converting project documentation so that you can search the files by phrase with the system's ordinary file search.
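If the generated files from many PDFs should not end up mixed together in one folder, the batch conversion can be wrapped in a small script. The following is only a sketch under some assumptions: it assumes the Poppler version of pdftohtml, which accepts an optional output base name as its second argument, and the function names `out_dir_for` and `convert_folder` as well as the `_html` suffix are made up for illustration.

```shell
#!/bin/sh

# Hypothetical helper: map "docs/manual.pdf" to "docs/manual_html".
out_dir_for() {
    printf '%s\n' "${1%.pdf}_html"
}

# Convert every PDF in the given folder (default: current directory),
# writing each document's HTML into its own sub-folder.
convert_folder() {
    for pdf in "${1:-.}"/*.pdf; do
        [ -e "$pdf" ] || continue            # the glob matched nothing
        out_dir=$(out_dir_for "$pdf")
        mkdir -p "$out_dir" || return 1
        # Assumption: the second argument is the output base name (Poppler).
        pdftohtml "$pdf" "$out_dir/index" || echo "failed: $pdf" >&2
    done
}
```

After sourcing the script, `convert_folder ~/docs` should leave one `*_html` folder per PDF. Quoting `"$pdf"` everywhere keeps file names with spaces intact.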

  • 2020-12-12 21:08

    Given the vagueness of the original question, I'm going to give a solution that works with any language that can execute command-line apps. Although it can be a little tricky to set up, OpenOffice can run in headless mode on a server and, with the help of jodconverter, convert any file format to any other file format (well, any conversion that OpenOffice can handle, that is).

    Here are a couple of links that help with the setup:

    • http://iwonderdesigns.posterous.com/how-to-run-jodconverteropenoffice-on-your-hos
    • http://www.artofsolving.com/node/10
  • 2020-12-12 21:12

    If you're looking for a way to convert PDF to HTML once or twice, then I recommend Adobe Online Conversion.

    If it's an API you're after then http://www.pdfonline.com/ has an SDK that should suit your needs.

    If it's a library you're after then please let us know which server-side language you prefer.
