How to extract text from an existing docx file using python-docx

不思量自难忘° 2020-11-27 15:59

I'm trying to use the python-docx module (pip install python-docx), but it seems very confusing, as in the GitHub repo test sample they are using

7 answers
  •  野趣味 (original poster)
    2020-11-27 16:29

    Without Installing python-docx

    A docx file is basically a zip archive with several folders and files inside it. The link below gives a simple function that extracts the text from a docx file without relying on python-docx or lxml, the latter sometimes being hard to install:

    http://etienned.github.io/posts/extract-text-from-word-docx-simply/
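For illustration, the zip-based approach described above might look roughly like the following. This is a minimal sketch using only the standard library, not the code from the linked post: the function name `docx_to_text` is hypothetical, and it assumes the main body lives in `word/document.xml` inside the archive, with paragraphs in `w:p` elements and text runs in `w:t` elements.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the body text of a .docx file, one paragraph per line.

    `path` may be a filename or a binary file-like object.
    Hypothetical helper: headers, footers and footnotes are ignored.
    """
    # A .docx file is a zip archive; the main body is stored as XML.
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:  # skip empty paragraphs
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

Calling `docx_to_text("example.docx")` (the filename is a placeholder) returns the text as a single string. Paragraphs inside table cells are picked up as well, since they are also `w:p` elements, but text nested in text boxes may appear more than once with this simple traversal.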
