How to use python-docx to replace text in a Word document and save

后悔当初 2020-11-29 19:25

The oodocx module mentioned in the same page refers the user to an /examples folder that does not seem to be there.
I have read the documentation of python-docx 0.7.2, p

7 answers
  •  萌比男神i
    2020-11-29 20:15

    I needed a way to do regular-expression replacement in a docx file. I started from scanny's answer. To preserve styles I used the answer from "Python docx Replace string in paragraph while keeping style", added a recursive call to handle nested tables, and came up with something like this:

    import re
    from docx import Document

    def docx_replace_regex(doc_obj, regex, replace):
        """Replace every match of a compiled regex in a document or table cell."""
        for p in doc_obj.paragraphs:
            if regex.search(p.text):
                # Edit run by run (a run is a span of text with one style)
                # so that each run keeps its own formatting.
                for run in p.runs:
                    if regex.search(run.text):
                        run.text = regex.sub(replace, run.text)

        # Cells have their own paragraphs and tables, so recurse into them.
        for table in doc_obj.tables:
            for row in table.rows:
                for cell in row.cells:
                    docx_replace_regex(cell, regex, replace)


    regex1 = re.compile(r"your regex")
    replace1 = r"your replace string"
    filename = "test.docx"
    doc = Document(filename)
    docx_replace_regex(doc, regex1, replace1)
    doc.save('result1.docx')
    

    To apply a dictionary of replacements, where each key is a regex pattern:

    for word, replacement in dictionary.items():
        word_re = re.compile(word)
        docx_replace_regex(doc, word_re, replacement)
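
The loop above treats each dictionary key as a regex and applies the replacements one after another, so an earlier replacement can be rewritten by a later one. If the keys are meant as literal words, one possible variation is to escape them and combine them into a single pattern. This is only a sketch: `build_literal_replacer` is a hypothetical helper name, not part of python-docx, and it relies on `regex.sub` accepting a function as the replacement.

```python
import re

def build_literal_replacer(mapping):
    """Return (regex, replace) that swaps literal keys for their values in one pass."""
    if not mapping:
        raise ValueError("mapping must not be empty")
    # Longest keys first, so a short key never shadows a longer one.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes characters such as "." match literally.
    regex = re.compile("|".join(re.escape(key) for key in keys))

    def replace(match):
        # Look up the replacement for the exact text that matched.
        return mapping[match.group(0)]

    return regex, replace

# Demonstration on a plain string; no docx file is needed here.
regex, replace = build_literal_replacer({"cat": "dog", "dog": "cat", "a.b": "X"})
result = regex.sub(replace, "cat chases dog, a.b axb")
print(result)
```

Because `docx_replace_regex` passes `replace` straight to `regex.sub`, the returned pair should be usable as `docx_replace_regex(doc, regex, replace)`.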
    

    Note that this solution replaces a match only if the whole match has the same style in the document, that is, if it lies within a single run.

    Also, if the text was edited after it was first saved, text with the same style may be split into separate runs. For example, if you open a document that contains the string "testabcd", change it to "test1abcd", and save, there are three separate runs, "test", "1", and "abcd", even though the style is the same. In this case replacing test1 won't work.

    Word does this to track changes in the document. To merge the text into one run, in Word go to "Options", "Trust Center", and in "Privacy Options" uncheck "Store random numbers to improve combine accuracy", then save the document.
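
If changing the Word setting is not an option, the split-run case described above can also be handled in code. The following is a minimal sketch, not part of the original answer: `replace_across_runs` is a hypothetical helper that matches against the joined run texts, puts the replacement into the first run a match touches, and cuts the matched text out of the remaining runs. It assumes `replace` is a string template, and it only relies on `paragraph.runs` and `run.text`, so the demonstration uses simple stand-in objects instead of a real document.

```python
import re
from types import SimpleNamespace

def replace_across_runs(paragraph, regex, replace):
    """Replace regex matches even when one match spans several runs.

    The replacement text takes the style of the first run it touches.
    Returns the number of matches replaced.
    """
    runs = paragraph.runs
    full_text = "".join(run.text for run in runs)
    matches = [m for m in regex.finditer(full_text) if m.end() > m.start()]
    # Work right to left so offsets of earlier matches stay valid.
    for match in reversed(matches):
        start, end = match.span()
        replacement = match.expand(replace)
        pos = 0
        placed = False
        for run in runs:
            text = run.text
            run_start, run_end = pos, pos + len(text)
            pos = run_end
            if run_end <= start or run_start >= end:
                continue  # this run does not overlap the match
            lo = max(start, run_start) - run_start
            hi = min(end, run_end) - run_start
            # Only the first overlapping run receives the replacement.
            run.text = text[:lo] + ("" if placed else replacement) + text[hi:]
            placed = True
    return len(matches)

# Stand-in for a paragraph whose text "test1abcd" is split into three runs.
paragraph = SimpleNamespace(
    runs=[SimpleNamespace(text=t) for t in ("test", "1", "abcd")]
)
count = replace_across_runs(paragraph, re.compile(r"test(\d)"), r"exam\1")
texts = [run.text for run in paragraph.runs]
print(count, texts)
```

With a python-docx document, the same function could be called for each paragraph in place of the inner run loop of `docx_replace_regex`. The trade-off is that any formatting differences inside a matched span are lost, because the whole replacement ends up in one run.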
