html to .doc converter in Python?

与世无争的帅哥 提交于 2019-11-27 14:48:20

You could use win32com from the pywin32 python extensions for windows, to let MS Word convert it for you. A simple example:

import win32com.client

word = win32com.client.Dispatch('Word.Application')

doc = word.Documents.Add('example.html')
doc.SaveAs('example.doc', FileFormat=0)
doc.Close()

word.Quit()

Though I am not aware of a direct module that can allow you to convert this, however:

  1. You can convert HTML to plain text first using the html2text module.
  2. After that, you can use this the python-docx module to convert the text to a doc or a docx file.

In case anybody else lands here attempting to convert the other way around, the above code works, but you need to modify the FileFormat value.

http://msdn.microsoft.com/en-us/library/ff839952.aspx

Example: Filtered html is 10, instead of 0.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!