csv read raises “UnicodeDecodeError: 'charmap' codec can't decode…”

旧街凉风 submitted on 2020-08-20 04:07:36

Question


I've read every post I can find, but my situation seems unique. I'm totally new to Python so this could be basic. I'm getting the following error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 70: character maps to undefined

When I run the code:

import csv

input_file = 'input.csv'
output_file = 'output.csv'
cols_to_remove = [4, 6, 8, 9, 10, 11,13, 14, 19, 20, 21, 22, 23, 24]

cols_to_remove = sorted(cols_to_remove, reverse=True)
row_count = 0 # Current amount of rows processed

with open(input_file, "r") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            print('\r{0}'.format(row_count), end='')
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)

What am I doing wrong?


Answer 1:


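In Python 3, the csv module works on unicode strings, so the input file has to be decoded first. That decoding happens in open(): with no encoding argument it falls back to the platform default, and a 'charmap' error usually means that default is a Windows code page (cp1252, for example) with no character assigned to byte 0x8d. You can pass the exact encoding if you know it, or just use Latin1, which maps every byte to the unicode character with the same code point, so decoding and then re-encoding leaves the byte values unchanged. Your code could become: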

...
with open(input_file, "r", encoding='Latin1') as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding='Latin1') as result:
        ...
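
To illustrate the round-trip claim above, here is a minimal sketch. It assumes the asker's default encoding was cp1252, which the thread never states; the 'charmap' message is merely consistent with it.

```python
# Every possible byte value, 0x00 through 0xff.
raw = bytes(range(256))

# Latin-1 gives each byte the code point with the same number,
# so decoding never fails and re-encoding restores the original bytes.
text = raw.decode('latin-1')
round_tripped = text.encode('latin-1')
print(round_tripped == raw)

# cp1252 (assumed here to be the default that failed) leaves a few
# byte values undefined, 0x8d among them.
try:
    b'\x8d'.decode('cp1252')
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print(exc)
```

Note that Latin1 only guarantees the bytes survive. If the file is really in another encoding, any non-ASCII characters will look wrong while they are held as strings, which is harmless for a script that only deletes columns.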



Answer 2:


Add encoding="utf8" when opening the files. Try the following instead:

with open(input_file, "r", encoding="utf8") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding="utf8") as result:
        writer = csv.writer(result)
        ...
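
For reference, the question's loop with an explicit encoding might be wrapped up as below. This is only a sketch: remove_columns and its parameters are hypothetical names, and unlike the original it skips column indexes that a short row does not have instead of raising IndexError. Passing "utf8" only helps if the file really is UTF-8; otherwise the same UnicodeDecodeError comes back with a different codec name.

```python
import csv
import os
import tempfile


def remove_columns(input_path, output_path, cols_to_remove, encoding):
    """Copy a CSV file, dropping the given zero-based column indexes."""
    # Delete from the highest index down so earlier deletions
    # do not shift the positions of later ones.
    ordered = sorted(set(cols_to_remove), reverse=True)
    row_count = 0
    with open(input_path, "r", newline="", encoding=encoding) as source, \
         open(output_path, "w", newline="", encoding=encoding) as result:
        writer = csv.writer(result)
        for row in csv.reader(source):
            for col_index in ordered:
                if col_index < len(row):  # tolerate short rows
                    del row[col_index]
            writer.writerow(row)
            row_count += 1
    return row_count


# Small demonstration on a temporary UTF-8 file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.csv")
    dst = os.path.join(tmp, "output.csv")
    with open(src, "w", newline="", encoding="utf8") as f:
        f.write("id,name,city\r\n1,Zo\u00eb,M\u00e1laga\r\n")
    processed = remove_columns(src, dst, [1], encoding="utf8")
    with open(dst, "r", newline="", encoding="utf8") as f:
        output_text = f.read()
```

Opening the input with newline="" as well follows the csv module's recommendation, so that line breaks inside quoted fields are handled by the reader rather than by open().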



Answer 3:


  1. Try pandas

import pandas

data = pandas.read_csv('input.csv')
data.to_csv('output.csv', index=False)

  2. Try saving the file again as CSV UTF-8
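
The suggestion to re-save the file as UTF-8 can also be carried out from Python rather than a spreadsheet program. The sketch below assumes the current encoding is known; reencode_to_utf8 is a hypothetical helper, and latin-1 is used in the demonstration purely as an example.

```python
import os
import tempfile


def reencode_to_utf8(src_path, dst_path, source_encoding):
    """Rewrite a text file as UTF-8, decoding it with source_encoding."""
    # newline="" on both sides keeps the original line endings untouched.
    with open(src_path, "r", newline="", encoding=source_encoding) as src, \
         open(dst_path, "w", newline="", encoding="utf8") as dst:
        for line in src:
            dst.write(line)


# Demonstration: a small Latin-1 file converted to UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    legacy = os.path.join(tmp, "legacy.csv")
    converted = os.path.join(tmp, "converted.csv")
    with open(legacy, "wb") as f:
        f.write("name\r\nJos\u00e9\r\n".encode("latin-1"))
    reencode_to_utf8(legacy, converted, source_encoding="latin-1")
    with open(converted, "rb") as f:
        converted_bytes = f.read()
```

Once the file is converted, opening it with encoding="utf8" as in Answer 2 should work.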


Source: https://stackoverflow.com/questions/59082843/csv-read-raises-unicodedecodeerror-charmap-codec-cant-decode
