Python 2.7 - find and replace from text file, using dictionary, to new text file

别跟我提以往 2021-01-05 17:27

I am a newbie to programming, and have been studying Python in my spare time for the past few months. I decided I was going to try and create a little script that converts Ame

3 answers

甜味超标 (original poster)

2021-01-05 18:14
The extra blank line you are seeing is because you are using print to write out a line that already includes a newline character at the end. Since print writes its own newline too, your output becomes double spaced. An easy fix is to use outfile.write(new_line) instead.
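
To see the effect in isolation, here is a small sketch that sends the same line through print and through write. The in-memory buffers and the sample line are stand-ins I made up for your output file and data; the `__future__` import is only there so that the `print(..., file=...)` form is also available on Python 2.7.

```python
from __future__ import print_function
import io

# A line as it comes out of a file: it already ends with '\n'.
line = u'colour\n'

printed = io.StringIO()
print(line, file=printed)   # print adds its own newline after the line

written = io.StringIO()
written.write(line)         # write emits the line exactly as given

print(repr(printed.getvalue()))  # two newlines: the double spacing you saw
print(repr(written.getvalue()))  # one newline
```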

As for the file modes, the issue is that you're opening the output file over and over. You should just open it once, at the start. It's usually a good idea to use with statements to handle opening files, since they'll take care of closing them for you when you're done with them.
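
If you do want to keep working line by line, a rough sketch of that structure might look like the following. The convert_lines helper, the temporary files and the lambda are all hypothetical names for illustration, not something from your script.

```python
import os
import tempfile

def convert_lines(in_path, out_path, convert):
    """Copy in_path to out_path, passing each line through convert()."""
    # Both files are opened exactly once. The with block closes them
    # when it ends, even if convert() raises an exception.
    with open(in_path, 'r') as in_file, open(out_path, 'w') as out_file:
        for line in in_file:
            # Each line keeps its own '\n', so write() needs no extra newline.
            out_file.write(convert(line))

# Small demonstration using temporary files.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'test_file.txt')
dst = os.path.join(tmp_dir, 'output_test_file.txt')
with open(src, 'w') as f:
    f.write('my color\nyour color\n')

convert_lines(src, dst, lambda line: line.replace('color', 'colour'))

with open(dst) as f:
    result = f.read()
print(repr(result))
```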

I don't understand your other issue, with only some of the replacements happening. Is your dictionary missing the spellings for 'analyze' and 'utilize'?

One suggestion I'd make is to not do your replacements line by line. You can read the whole file in at once with file.read() and then work on it as a single unit. This will probably be faster, since it won't need to loop as often over the items in your spelling dictionary (just once, rather than once per line):
```
with open('test_file.txt', 'r') as in_file:
    text = in_file.read()

with open('output_test_file.txt', 'w') as out_file:
    out_file.write(replace_all(text, spelling_dict))
```
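
The snippet above relies on the replace_all function from your question, which isn't reproduced here. For anyone reading along, this is only a guess at its general shape, assuming the dictionary maps each word to find onto its replacement (your real dictionary may be keyed the other way around):

```python
def replace_all(text, spelling_dict):
    """Apply every replacement in spelling_dict to text with str.replace."""
    # One pass over the dictionary for the whole text, rather than one
    # pass per line.
    for old, new in spelling_dict.items():
        text = text.replace(old, new)
    return text

# Made-up sample data for illustration.
spelling_dict = {'color': 'colour', 'utilize': 'utilise'}
sample = 'We utilize color.\nMore color here.\n'
print(replace_all(sample, spelling_dict))
```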
Edit:

To make your code correctly handle words that contain other words (like "entire" containing "tire"), you probably need to abandon the simple str.replace approach in favor of regular expressions.
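
A tiny illustration of the problem, using a sample string I made up: str.replace rewrites the 'tire' inside 'entire', while a pattern anchored with \b word boundaries leaves it alone.

```python
import re

text = 'entire tire'

# str.replace matches the substring anywhere, including inside other words.
naive = text.replace('tire', 'tyre')

# \b only matches at a word boundary, so 'entire' is not touched.
bounded = re.sub(r'\btire\b', 'tyre', text)

print(naive)    # entyre tyre
print(bounded)  # entire tyre
```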

Here's a quickly thrown together solution that uses re.sub, given a dictionary of spelling changes from American to British English (that is, in the reverse order of your current dictionary):
```
import re

#from english_american_dictionary import ame_to_bre_spellings
ame_to_bre_spellings = {'tire':'tyre', 'color':'colour', 'utilize':'utilise'}

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        return spelling_dict.get(word, word)
    return replacer

def ame_to_bre(text):
    pattern = r'\b\w+\b'  # this pattern matches whole words only
    replacer = replacer_factory(ame_to_bre_spellings)
    return re.sub(pattern, replacer, text)

def main():
    #with open('test_file.txt') as in_file:
    #    text = in_file.read()
    text = 'foo color, entire, utilize'

    #with open('output_test_file.txt', 'w') as out_file:
    #    out_file.write(ame_to_bre(text))
    print(ame_to_bre(text))

if __name__ == '__main__':
    main()
```
One nice thing about this code structure is that you can easily convert from British English spellings back to American English ones, if you pass a dictionary in the other order to the replacer_factory function.
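
As a rough sketch of that reverse direction: the bre_to_ame_spellings name and the convert helper below are my own additions for illustration, and swapping keys and values like this assumes every British spelling in the dictionary is unique.

```python
import re

ame_to_bre_spellings = {'tire': 'tyre', 'color': 'colour', 'utilize': 'utilise'}

# Swap keys and values to get a British-to-American dictionary.
bre_to_ame_spellings = {bre: ame for ame, bre in ame_to_bre_spellings.items()}

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        # Fall back to the original word when it is not in the dictionary.
        return spelling_dict.get(word, word)
    return replacer

def convert(text, spelling_dict):
    """Replace whole words in text according to spelling_dict."""
    return re.sub(r'\b\w+\b', replacer_factory(spelling_dict), text)

print(convert('foo colour, entire, utilise', bre_to_ame_spellings))
# foo color, entire, utilize
```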