How to extract the URLs of hyperlinks from a docx file using Python

清酒与你 2020-12-18 11:21

I've been trying to find out how to get URLs from a docx file using Python, but failed to find anything. I've tried python-docx and python-docx2txt, but python-docx only

5 answers
  •  春和景丽
    2020-12-18 12:12

    I solved it using the following code, which prints the hyperlink targets stored in a docx file:

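    from docx import Document
    from docx.opc.constants import RELATIONSHIP_TYPE as RT
    
    document = Document('test.docx')
    rels = document.part.rels
    
    def iter_hyperlink_rels(rels):
        # Yield the target URL of every hyperlink relationship in the part.
        for rel in rels.values():
            if rel.reltype == RT.HYPERLINK:
                yield rel.target_ref
    
    # The function is a generator, so iterate over it to print each URL.
    for url in iter_hyperlink_rels(rels):
        print(url)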
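
    If installing python-docx is not an option, the same relationship data can probably be read with the standard library alone, since a docx file is a zip archive. The following is only a sketch under assumptions: it assumes the usual package layout, where the main body's relationships are stored in word/_rels/document.xml.rels, and extract_hyperlink_urls is a hypothetical helper name, not part of any library.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the relationships part and the relationship type that
# marks a hyperlink in an Office Open XML package.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def extract_hyperlink_urls(docx_path):
    """Return the targets of all hyperlink relationships of the main body.

    Hypothetical helper for illustration; assumes the conventional
    word/_rels/document.xml.rels location inside the archive.
    """
    with zipfile.ZipFile(docx_path) as archive:
        # Raises KeyError if the document has no relationships part.
        rels_xml = archive.read("word/_rels/document.xml.rels")

    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.iter(f"{{{RELS_NS}}}Relationship")
        if rel.get("Type") == HYPERLINK_TYPE
    ]
```

    Calling extract_hyperlink_urls('test.docx') returns a list of URL strings. Hyperlinks in headers and footers are stored in separate relationship files, so this sketch, like the code above, only covers the main document body.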
    
