I'm trying to use the python-docx module (pip install python-docx), but it seems very confusing, as in the GitHub repo test sample they are using
You can use python-docx2txt, which is adapted from python-docx but can also extract text from links, headers, and footers. It can also extract images.
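As a rough sketch of how that might look in practice: the package is published on PyPI as docx2txt (pip install docx2txt), and its main entry point is docx2txt.process. The helper name extract_docx and the file paths in the usage note are made up for illustration.

```python
import os


def extract_docx(path, image_dir=None):
    """Return the text of a .docx file; optionally dump its images.

    extract_docx is a hypothetical wrapper, not part of docx2txt itself.
    """
    # Imported here so the sketch only needs the package when it is called.
    import docx2txt

    if image_dir is None:
        # Text only: body text plus headers, footers and link text.
        return docx2txt.process(path)

    # docx2txt writes images into an existing directory, so create it first.
    os.makedirs(image_dir, exist_ok=True)
    return docx2txt.process(path, image_dir)
```

For example, text = extract_docx("report.docx", "report_images") should return the document text as one string and copy any embedded images into report_images, assuming report.docx exists.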