multiple regex substitution in multiple files using python [closed]

独自空忆成欢 提交于 2019-12-01 14:46:07

Is this what you're looking for?

Unfortunately you didn't specify how you know which regexes are supposed to be applied, so I put them into a list of tuples (first element is the regex, second is the replacement text).

import os, os.path, re

path = "/Users/mypath/testData"
myfiles = os.listdir(path)
# its much faster if you compile your regexes before you
# actually use them in a loop
REGEXES = [(re.compile(r'dog'), 'cat'),
           (re.compile(r'123'), '789')]
for f in myfiles:
    # split the filename and file extension for use in
    # renaming the output file
    file_name, file_extension = os.path.splitext(f)
    generated_output_file = file_name + "_regex" + file_extension

    # As l4mpi said ... if odt is zipped, you'd need to unzip it first
    # re.search is slower than a simple if statement
    if file_extension in ('.txt', '.doc', '.odt', '.htm', '.html'):

        # Declare input and output files, open them,
        # and start working on each line.
        input_file = os.path.join(path, f)
        output_file = os.path.join(path, generated_output_file)

        with open(input_file, "r") as fi, open(output_file, "w") as fo:
            for line in fi:
                for search, replace in REGEXES:
                    line = search.sub(replace, line)
                fo.write(line)
        # both the input and output files are closed automatically
        # after the with statement closes
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!