Is there any way to read a .docx file, including auto numbering, using python-docx?

天命终不由人 2021-02-01 06:49

Problem statement: Extract sections from a .docx file, including the auto numbering.

I tried python-docx to extract text from the .docx file, but it leaves out the auto numbering.

2 answers
  •  慢半拍i (OP)
    2021-02-01 07:12

    There is a package, docx2python, that does this much more simply: pypi.org/project/docx2python/

    The following code:

    from docx2python import docx2python
    # Parse the .docx file (example path)
    document = docx2python("C:/input/MyDoc.docx")
    # body holds the main document text as nested lists
    print(document.body)
    

    produces a list containing the contents, including bullet lists, in an easily parseable form.
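
If that list needs to be turned into a flat sequence of paragraphs, something like the following might work. This is only a sketch, assuming `document.body` is a list nested several levels deep (tables, rows, cells, paragraphs) with strings at the bottom. `iter_paragraphs` and `sample_body` are hypothetical names, and the sample data, including its `1)` and `a)` prefixes, is made up to stand in for a real document.

```python
from typing import Any, Iterator


def iter_paragraphs(nested: Any) -> Iterator[str]:
    """Yield every string found in an arbitrarily nested list, in order."""
    if isinstance(nested, str):
        yield nested
    else:
        for item in nested:
            yield from iter_paragraphs(item)


# Hypothetical stand-in for document.body; real output depends on the file.
sample_body = [[[["1)\tIntroduction", "a)\tScope"]], [["2)\tMethods", ""]]]]

# Drop empty paragraphs and keep the rest in document order.
paragraphs = [p for p in iter_paragraphs(sample_body) if p.strip()]
print(paragraphs)
```

With a real file, `sample_body` would be replaced by `document.body` from the snippet above.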
