How can I insert a checkbox form into a .docx file using python-docx?


闹比i 2020-12-21 20:11

I've been using Python to implement a custom parser and use that parsed data to format a Word document to be distributed internally. All of the formatting has been straight

2 answers

温柔的废话 (original poster)

2020-12-21 20:27

The key thing with these workaround functions is to have an example of XML that works and to compare it against the XML you generate. If you generate XML that matches the working example, it will work every time. opc-diag is handy for inspecting the XML in a Word document. Working with really small documents (a single paragraph or a two-row table, for analysis purposes) makes it a lot easier to work out how Word is structuring the XML.
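As a rough sketch of that comparison step, the standard library's canonicalize() can normalize both snippets so that indentation and attribute order don't cause false mismatches, while child order still counts. The helper name same_xml and the sample paragraphs are made up for illustration; this is not part of python-docx or opc-diag. Note that strip_text=True also trims whitespace inside w:t text, so treat the result as a quick check rather than a strict one.

```python
from xml.etree.ElementTree import canonicalize

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def same_xml(a: str, b: str) -> bool:
    """Return True when two XML snippets match after C14N normalization."""
    return canonicalize(a, strip_text=True) == canonicalize(b, strip_text=True)


# A known-good paragraph, e.g. copied from a document Word saved itself.
working = (
    f'<w:p xmlns:w="{W}"><w:pPr><w:jc w:val="center"/></w:pPr>'
    "<w:r><w:t>Hi</w:t></w:r></w:p>"
)

# The same paragraph, pretty-printed: only the whitespace differs.
generated = f"""
<w:p xmlns:w="{W}">
  <w:pPr>
    <w:jc w:val="center"/>
  </w:pPr>
  <w:r>
    <w:t>Hi</w:t>
  </w:r>
</w:p>
""".strip()

# Same elements, but w:pPr comes after w:r: the kind of ordering slip
# that makes Word offer to "repair" the file.
swapped = (
    f'<w:p xmlns:w="{W}"><w:r><w:t>Hi</w:t></w:r>'
    '<w:pPr><w:jc w:val="center"/></w:pPr></w:p>'
)

print(same_xml(working, generated))
print(same_xml(working, swapped))
```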

An important thing to note is that the XML elements in a Word document are sequence-sensitive, meaning the child elements within any other element generally have a set order in which they must appear. If you get this order wrong, you get the "repair" error you mentioned.
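To illustrate the ordering rule, here is a minimal sketch using only the standard library. The helper insert_in_order and the PPR_ORDER list are hypothetical; the list is an abbreviated subset of the child order I understand wml.xsd to define for w:pPr, so check the schema for the element you actually work with.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)

# Abbreviated subset of the w:pPr child order (assumption: verify in wml.xsd).
PPR_ORDER = ["pStyle", "keepNext", "spacing", "ind", "jc"]


def w_tag(local: str) -> str:
    """Build a namespace-qualified tag name such as {ns}jc."""
    return f"{{{W}}}{local}"


def insert_in_order(parent, child, order):
    """Insert child before the first sibling that must come after it."""
    local = child.tag.split("}")[1]
    successors = {w_tag(name) for name in order[order.index(local) + 1:]}
    for idx, sibling in enumerate(parent):
        if sibling.tag in successors:
            parent.insert(idx, child)
            return child
    parent.append(child)  # no successor present, so the end is correct
    return child


pPr = ET.Element(w_tag("pPr"))
ET.SubElement(pPr, w_tag("pStyle"), {w_tag("val"): "Normal"})
ET.SubElement(pPr, w_tag("jc"), {w_tag("val"): "center"})

# A plain append would put w:spacing after w:jc, which is out of order.
insert_in_order(pPr, ET.Element(w_tag("spacing"), {w_tag("after"): "120"}), PPR_ORDER)

child_names = [c.tag.split("}")[1] for c in pPr]
print(child_names)
```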

I find it much easier to manipulate the XML from within python-docx, as it takes care of all the unzipping and rezipping for you, along with a lot of the other details.

To get the sequencing right, you'll need to be familiar with the XML Schema specifications for the elements you're working with. There is an example here: http://python-docx.readthedocs.io/en/latest/dev/analysis/features/text/paragraph-format.html

The full schema is in the python-docx code tree under ref/xsd/. Most of the elements for text are in the wml.xsd file (wml stands for WordprocessingML, the Wordprocessing Markup Language).

You can find examples of other so-called "workaround functions" by searching for "python-docx workaround function". Pay particular attention to the parse_xml() function, which creates a new XML subtree from a string, and to OxmlElement objects, which create individual elements. XML elements can be positioned using regular lxml.etree._Element methods; all XML elements in python-docx are based on lxml. http://lxml.de/api/lxml.etree._Element-class.html
