Remove non-ASCII characters from a CSV file using Python

Question

I am trying to remove non-ASCII characters from a file. Specifically, I am trying to convert a text file that contains such characters (e.g. hello§‚å½¢æˆ äº†å¯¹æ¯”ã€‚ èŠ±å) into a CSV file.

However, I am unable to iterate through these characters, so I want to remove them (i.e. either drop them or replace them with a space). Here is my code, researched and gathered from various sources.

The problem is that after running the script, the CSV file has not been updated: the characters are still there. I have no idea how to proceed from here, and I have been researching this for a day :(

Would kindly appreciate your help!

import csv

txt_file = r"xxx.txt"
csv_file = r"xxx.csv"

in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
for row in in_txt:
    for i in row:
        i = "".join([a if ord(a)<128 else''for a in i])

out_csv.writerows(in_txt)

Answer 1:

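Assigning to the loop variable i only rebinds that name; the change is not magically transferred back to row or to the original source. In addition, by the time out_csv.writerows(in_txt) runs, the for loop has already consumed the reader, so there is nothing left to write. You have to build up a new list of your changed rows: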

import csv

txt_file = r"xxx.txt"
csv_file = r"xxx.csv"

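# Python 2 convention: the csv module works on files opened in binary mode.
in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
out_txt = []
for row in in_txt:
    # Build a cleaned copy of each row, keeping only characters below 128.
    out_txt.append([
        "".join(a if ord(a) < 128 else '' for a in i)
        for i in row
    ])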

out_csv.writerows(out_txt)
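
The answer's code opens the files in binary mode, which is the Python 2 convention. On Python 3 the same idea could be sketched roughly as below. The helper names strip_non_ascii and tsv_to_ascii_csv are made up for illustration, and the input is assumed to be UTF-8, which the question does not state. Rows are written one at a time rather than collected in a list first.

```python
import csv


def strip_non_ascii(text):
    """Return text with every character outside the ASCII range dropped."""
    return "".join(ch for ch in text if ord(ch) < 128)


def tsv_to_ascii_csv(txt_path, csv_path, encoding="utf-8"):
    """Read a tab-delimited file and write an ASCII-only CSV copy.

    The default encoding is an assumption; pass the real one if it is known.
    """
    # errors="replace" turns undecodable bytes into U+FFFD, which is then
    # stripped like any other non-ASCII character.
    # newline="" is what the csv module documentation recommends on Python 3.
    with open(txt_path, newline="", encoding=encoding, errors="replace") as src, \
         open(csv_path, "w", newline="", encoding="ascii") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst)
        for row in reader:
            # Write a cleaned copy of each row as soon as it is read.
            writer.writerow([strip_non_ascii(cell) for cell in row])
```

Calling tsv_to_ascii_csv("xxx.txt", "xxx.csv") would then do the conversion. The with block also closes both files, so the output is flushed to disk. To put a space in place of each removed character instead of dropping it, the generator in strip_non_ascii could yield " " for characters at or above 128.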

Source: https://stackoverflow.com/questions/37457277/remove-non-ascii-characters-from-csv-file-using-python

Tags

python

csv

unicode

ascii
