Converting docx to pdf with pure python (on linux, without libreoffice)

醉酒成梦 2020-12-15 20:27

I'm dealing with a problem trying to develop a web-app, part of which converts uploaded docx files to pdf files (after some processing). With python-docx and o

2 answers
  • 2020-12-15 21:27

    Another option is LibreOffice. However, as the first responder said, the output quality will never be as good as converting with Word itself through comtypes.

    Anyway, once LibreOffice is installed, this code does the conversion:

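    import subprocess

    # Default Windows install location. On Linux the binary is usually
    # just "soffice" (or "libreoffice") on the PATH.
    LIBRE_OFFICE = r"C:\Program Files\LibreOffice\program\soffice.exe"


    def convert_to_pdf(input_docx, out_folder):
        """Convert input_docx to PDF, writing the result into out_folder."""
        cmd = [LIBRE_OFFICE, '--headless', '--convert-to', 'pdf',
               '--outdir', out_folder, input_docx]
        print(cmd)  # show the exact command being run
        # check=True raises CalledProcessError if soffice exits with an error.
        subprocess.run(cmd, check=True)


    sample_doc = 'file.docx'
    out_folder = 'some_folder'
    convert_to_pdf(sample_doc, out_folder)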
    
  • 2020-12-15 21:33

    The PythonAnywhere help pages offer information on working with PDF files here: https://help.pythonanywhere.com/pages/PDF

    Summary: PythonAnywhere has a number of Python packages for PDF manipulation installed, and one of them may do what you want. However, shelling out to abiword seems easiest to me. The shell command abiword --to=pdf filetoconvert.docx will convert the docx file to a PDF and produce a file named filetoconvert.pdf in the same directory as the docx. Note that this command will output an error message to the standard error stream complaining about XDG_RUNTIME_DIR (or at least it did for me), but it still works, and the error message can be ignored.
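
    If you want to call that command from Python rather than a shell, something like the following might work. It is only a sketch: the helper names (abiword_command, expected_pdf_path, convert_with_abiword) are made up for illustration, and it assumes abiword is on the PATH and behaves as described above, writing the PDF next to the input file. Because the stderr output can be noise, it checks for the output file instead of inspecting the error stream.

```python
import subprocess
from pathlib import Path


def abiword_command(docx_path):
    """Build the argument list for: abiword --to=pdf <docx_path>."""
    return ["abiword", "--to=pdf", str(docx_path)]


def expected_pdf_path(docx_path):
    """Assumed output location: same directory and name, .pdf extension."""
    return Path(docx_path).with_suffix(".pdf")


def convert_with_abiword(docx_path):
    """Run AbiWord and return the path of the PDF it produced."""
    # Capture stderr so the XDG_RUNTIME_DIR complaint stays out of the logs.
    result = subprocess.run(abiword_command(docx_path),
                            stderr=subprocess.PIPE, text=True)
    pdf_path = expected_pdf_path(docx_path)
    # Judge success by the output file, since stderr may contain noise
    # even when the conversion worked.
    if not pdf_path.exists():
        raise RuntimeError(f"abiword produced no PDF: {result.stderr.strip()}")
    return pdf_path


if __name__ == "__main__":
    print(convert_with_abiword("filetoconvert.docx"))
```

    Note that a PDF left over from an earlier run would also pass the existence check, so in a web app you would probably want to convert each upload in a fresh temporary directory.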
