Question
PDF.js is the latest library from Mozilla: a standards-based PDF renderer written entirely in JavaScript. Currently you cannot access the generated HTML, and the library can only be used as a viewer. Is it possible to use PDF.js to statically convert a PDF to its HTML equivalent? Since it renders in a browser, I assume the output must be HTML+CSS, with the JS used only for navigation.
After converting it to HTML, I plan to use our existing HTML workflow to import/index/consume the page as if it were an ordinary HTML webpage.
Answer 1:
Note: this is for the original question, as well as for others who may be visiting this for related help, as was the case with me. ;)
Answer:
You may try Poppler, or pdf2htmlEX, which is based on Poppler.
I'd recommend looking at the pdf2htmlEX documentation; it also has a very good comparison table.
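For the static-conversion workflow described in the question, a minimal sketch of driving pdf2htmlEX from a script might look like the following. It assumes pdf2htmlEX is installed and on PATH, and that it accepts a `--dest-dir` option followed by the input PDF and an optional output file name (check `pdf2htmlEX --help` for your version). The function names `build_pdf2htmlex_command` and `convert_pdf_to_html` are hypothetical and exist only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdf2htmlex_command(pdf_path, dest_dir, output_name=None):
    """Build the argument list for a pdf2htmlEX run (nothing is executed here)."""
    pdf_path = Path(pdf_path)
    # Default the output name to the input file name with an .html suffix.
    output_name = output_name or pdf_path.with_suffix(".html").name
    # Assumed CLI shape: pdf2htmlEX --dest-dir <dir> <input.pdf> <output.html>
    return ["pdf2htmlEX", "--dest-dir", str(dest_dir), str(pdf_path), output_name]


def convert_pdf_to_html(pdf_path, dest_dir):
    """Convert one PDF to a static HTML file and return the output path."""
    if shutil.which("pdf2htmlEX") is None:
        raise RuntimeError("pdf2htmlEX was not found on PATH")
    command = build_pdf2htmlex_command(pdf_path, dest_dir)
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(command, check=True)
    return Path(dest_dir) / command[-1]
```

Calling `convert_pdf_to_html("report.pdf", "out")` would, under those assumptions, leave an HTML file in `out/` that an existing HTML import or indexing pipeline could pick up like any other page.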
Answer 2:
PDF.js renders to Canvas, so it can't be used to statically convert a PDF to HTML.
Answer 3:
DocPub is powered by PDFNet, a PDF SDK with C# support, which supports converting PDF to HTML offline.
WebViewer from the same company is an HTML5-based PDF viewer that renders documents on-the-fly within the browser.
WebViewer works with all major Web platforms; the viewer can be directly embedded and customized within any HTML5, Silverlight, or Flash application. The content can be instantly accessed from any system or device, including iPad/iPhone (iOS), Android, Windows (desktop & tablets), WP8, Linux, Mac, etc.
Answer 4:
AccuSoft has an HTML5-based PDF/DOC viewer called Prizm. I don't think this can convert the PDF statically to HTML, but it looks like a functional HTML5-based viewer. I have no experience with it, but the online HTML5 demo (the link) looks pretty impressive. They claim it can be used on PC & Mobile for great rendering of such files.
Accusoft HTML5 viewing technology can display virtually any document file—DOC, PDF, PPT, CAD and dozens more—through the native browser on almost any smartphone or tablet, with no additional apps or players required on users’ devices.
Source: https://stackoverflow.com/questions/16785198/use-pdf-js-to-statically-convert-a-pdf-to-html