How to delete specific strings from a file?

花落未央 2020-12-02 01:22

I have a data file (an unstructured, messy file) from which I have to scrub a specific list of strings (that is, delete those strings).

Here is what I am doing but with no result:

5 answers
  •  难免孤独
    2020-12-02 01:54

    To the OP, Ross Patterson's method above works perfectly for me, i.e.

    infile = "messy_data_file.txt"
    outfile = "cleaned_file.txt"
    
    delete_list = ["word_1", "word_2", "word_n"]
    # The with statement closes both files, even if an error occurs.
    with open(infile) as fin, open(outfile, "w") as fout:
        for line in fin:
            # Strip every unwanted word from the current line.
            for word in delete_list:
                line = line.replace(word, "")
            fout.write(line)
    

    Example:

    I have a file named messy_data_file.txt that includes the following words (animals), not necessarily on the same line. Like this:

    Goat
    Elephant
    Horse Donkey Giraffe
    Lizard
    Bird
    Fish
    

    When I modify the code to read (actually just adding the words to delete to the "delete_list" line):

    infile = "messy_data_file.txt"
    outfile = "cleaned_file.txt"
    
    delete_list = ["Donkey", "Goat", "Fish"]
    # The with statement closes both files, even if an error occurs.
    with open(infile) as fin, open(outfile, "w") as fout:
        for line in fin:
            # Strip every unwanted word from the current line.
            for word in delete_list:
                line = line.replace(word, "")
            fout.write(line)
    

    The resulting "cleaned_file.txt" looks like this:

    Elephant
    Horse  Giraffe
    Lizard
    Bird
    

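    There is a blank line where "Goat" used to be, because replace() removes the word but leaves the line's newline in place. The same applies to the "Fish" line. "Donkey" shared its line with other words, so only an extra space remains there. For my purposes, this works fine.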
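
    If the blank lines are unwanted, one possible tweak is to skip any line that is left with only whitespace after the words are removed. The sketch below is not part of Mr. Patterson's code; scrub_lines is a hypothetical helper name, and it works on a list of lines so it can be tried without touching any files.

```python
def scrub_lines(lines, delete_list):
    """Remove each word in delete_list; drop lines left empty (hypothetical helper)."""
    cleaned = []
    for line in lines:
        was_blank = not line.strip()
        for word in delete_list:
            line = line.replace(word, "")
        # Keep lines that were already blank in the input, but drop lines
        # that became blank only because their words were deleted.
        if line.strip() or was_blank:
            cleaned.append(line)
    return cleaned


sample = ["Goat\n", "Elephant\n", "Horse Donkey Giraffe\n",
          "Lizard\n", "Bird\n", "Fish\n"]
result = scrub_lines(sample, ["Donkey", "Goat", "Fish"])
print("".join(result), end="")
```

    With a real file, the same function could be fed the open input file and its result passed to fout.writelines().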

    I also add input("Press Enter to exit...") at the very end of the code to keep the command-line window from opening and slamming shut when I double-click remove_text.py to run it. Take note, though, that you'll see no errors this way: if the script fails before reaching that line, the window still closes immediately.

    To see any errors, I run it from the command line instead (where C:\Just_Testing is the directory that holds all my files, i.e. remove_text.py and messy_data_file.txt), like this:

    C:\Just_Testing>py remove_text.py
    

    or

    C:\Just_Testing>python remove_text.py 
    

    works exactly the same.
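
    For anyone who still prefers double-clicking the script, one possible way to see errors is to catch them, print the traceback, and only then wait for Enter. This is a sketch, not part of the original answer; run_and_pause, its pause parameter, and failing_main are hypothetical names (the parameter exists only so the pause can be swapped out when trying the code).

```python
import traceback


def run_and_pause(main, pause=input):
    """Run main(); print any traceback, then wait before the window closes."""
    ok = True
    try:
        main()
    except Exception:
        # Show the full error instead of letting the window vanish with it.
        traceback.print_exc()
        ok = False
    finally:
        pause("Press Enter to exit...")
    return ok


def failing_main():
    # Stand-in for the file-cleaning code; it fails on purpose.
    raise FileNotFoundError("messy_data_file.txt")


# At the end of remove_text.py one would call, for example:
#     run_and_pause(failing_main)
```

    Running from the command line, as shown above, remains the simpler option.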

    Of course, like when writing HTML, I guess it never hurts to use a fully qualified path when running py or python from somewhere other than the directory you happen to be sitting in, such as:

    C:\Windows\System32>python C:\Users\Me\Desktop\remove_text.py
    

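    Of course, in the code it would be (note the r prefix: in a normal Python 3 string literal, \U starts a Unicode escape, so "C:\Users..." raises a SyntaxError):

    infile = r"C:\Users\Me\Desktop\messy_data_file.txt"
    outfile = r"C:\Users\Me\Desktop\cleaned_file.txt"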
    

    Be careful to use the same fully qualified path for the newly created cleaned_file.txt; otherwise it is created in whatever directory you run the command from, which can cause confusion when you go looking for it.
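
    One way to avoid that confusion is to derive the output path from the input path, so both files always sit in the same folder. Below is a minimal sketch using the standard pathlib module; the Desktop location is just the example path from above, and output_next_to is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def output_next_to(infile, name="cleaned_file.txt"):
    """Return a path for `name` in the same folder as infile (hypothetical helper)."""
    return PureWindowsPath(infile).with_name(name)


# Raw string, so the backslashes are kept literally.
infile = r"C:\Users\Me\Desktop\messy_data_file.txt"
outfile = output_next_to(infile)
print(outfile)
```

    PureWindowsPath is used here only so the example behaves the same on any operating system; in a script that runs on Windows, pathlib.Path is the usual choice.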

    Personally, I have PATH in my Environment Variables set to point to all my Python installs, e.g. C:\Python3.5.3, C:\Python2.7.13, etc., so I can run py or python from anywhere.

    Anyway, I hope making fine-tuning adjustments to this code from Mr. Patterson can get you exactly what you need. :)
