Problem statement: Extract sections from a .docx file, including autonumbering.
I tried python-docx to extract text from a .docx file, but it excludes the autonumbering.
There is a package, docx2python, which does this much more simply: pypi.org/project/docx2python/
The following code:
    from docx2python import docx2python
    document = docx2python("C:/input/MyDoc.docx")
    print(document.body)
produces a nested list containing the document's contents, including bullet and numbered lists, in an easily parseable form.
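Since the result is nested (roughly tables, then rows, then cells, then paragraphs), it can help to flatten it before looking for section headings. Here is a minimal sketch of that idea. The helper name `flatten_paragraphs` and the sample data are made up for illustration, and the exact numbering strings docx2python emits may differ from what is shown here:

```python
def flatten_paragraphs(nested):
    """Yield every string found in an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, str):
            yield item
        else:
            yield from flatten_paragraphs(item)


# Hypothetical sample shaped like document.body; a real document
# will contain whatever text and numbering the .docx file has.
sample_body = [[[["1) Introduction", "\ta) Scope", "", "Some text."]]]]

# Drop empty paragraphs and keep the rest in document order.
paragraphs = [p for p in flatten_paragraphs(sample_body) if p.strip()]
print(paragraphs)
```

With a real file, you would pass `document.body` in place of `sample_body`.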