Remove special characters from csv file using python

渐次进展 2021-01-14 08:11

There seems to be something on this topic already (How to replace all those Special Characters with white spaces in python?), but I can't figure this simple task out for the l

4 Answers
  •  渐次进展
    2021-01-14 08:47

    I might do something like

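    import csv
    
    # Python 3: open CSV files in text mode with newline="" so the csv module handles line endings.
    with open("special.csv", newline="") as infile, open("repaired.csv", "w", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        conversion = set('_"/.$')  # characters to replace with "_"
        for row in reader:
            # Rebuild each field, swapping any character found in `conversion`.
            newrow = [''.join('_' if c in conversion else c for c in entry) for entry in row]
            writer.writerow(newrow)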
    

    which turns

    $ cat special.csv
    th$s,2.3/,will-be
    fixed.,even.though,maybe
    some,"shoul""dn't",be
    

    (note that I have a quoted value) into

    $ cat repaired.csv 
    th_s,2_3_,will-be
    fixed_,even_though,maybe
    some,shoul_dn't,be
    
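    If you want to try this without touching files, here is a minimal sketch of the same per-field replacement on in-memory text. The function name `repair_csv_text` and its parameters are my own invention for illustration, and I set `lineterminator="\n"` only to keep the output easy to compare.

```python
import csv
import io

# Characters to replace; the same set used in the file-based version.
CONVERSION = set('_"/.$')

def repair_csv_text(text, conversion=CONVERSION, replacement="_"):
    """Return CSV text with every character in `conversion` replaced, field by field."""
    infile = io.StringIO(text, newline="")
    outfile = io.StringIO(newline="")
    # lineterminator="\n" is an assumption made for readable output;
    # the csv module's default is "\r\n".
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        writer.writerow(
            ["".join(replacement if c in conversion else c for c in entry)
             for entry in row]
        )
    return outfile.getvalue()

sample = 'th$s,2.3/,will-be\nfixed.,even.though,maybe\nsome,"shoul""dn\'t",be\n'
repaired = repair_csv_text(sample)
print(repaired)
```

    Because the text goes through csv.reader first, the quoted value is unquoted before the replacement runs, which is why the doubled quote collapses to a single `_`.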

    Right now, your code is reading the entire file into one big string:

    text =  input.read()
    

    Starting from a _ character:

    newtext = '_'
    

    Looping over every single character in text:

    for c in text:
    

    Adding the corrected character to newtext (very slowly, since each += builds a new string):

        newtext += '_' if c in conversion else c
    

    And then writing the original character (?), as a row with a single column, to a new CSV:

        writer.writerow(c)
    

    .. which is unlikely to be what you want. :^)
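    To see why, here is a small sketch of what calling writerow on each character does, next to the join-based replacement. The helper name `write_chars_as_rows` is hypothetical, and I am assuming the loop looks like the fragments quoted above.

```python
import csv
import io

def write_chars_as_rows(text):
    """Mimic calling writer.writerow(c) once for every character of `text`."""
    out = io.StringIO(newline="")
    # lineterminator="\n" is chosen here only to keep the output short.
    writer = csv.writer(out, lineterminator="\n")
    for c in text:
        # A one-character string is treated as a row holding one field.
        writer.writerow(c)
    return out.getvalue()

conversion = set('_"/.$')

# Each original character lands on its own line, and nothing is replaced.
one_per_row = write_chars_as_rows("a$b")

# Building the corrected text in one pass with join instead of repeated +=.
corrected = "".join("_" if c in conversion else c for c in "a$b")

print(repr(one_per_row))
print(corrected)
```

    The first result has one character per row, with the `$` still in it; the second is the string you probably wanted.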
