Python read csv - BOM embedded into the first key

Posted by 爷,独闯天下 on 2019-11-26 14:27:11

Question


I'm using Python 2.7.12. With this code snippet I'm saving a UTF-8 CSV file, writing the BOM (byte order mark) at the beginning of the file:

import codecs
import csv

outputFile = open("test.csv", "wb")
outputFile.write(codecs.BOM_UTF8)
fieldnames = ["a", "b"]
writer = csv.DictWriter(outputFile, fieldnames, delimiter=";")
writer.writeheader()
row = dict([])
for i in range(10):
    row["a"] = str(i).encode("utf-8")
    row["b"] = str(i*2).encode("utf-8")
    writer.writerow(row)
outputFile.close()

Then I want to load that CSV file:

import codecs
import csv
inputFile = open("test.csv", "rb")
reader = csv.DictReader(inputFile, delimiter=";")
for row in reader:
    print row["a"]
inputFile.close()

The above code fails with KeyError: 'a'. If I print the row keys, this is how they look: [u'\ufeffa', u'b']. The BOM has been embedded into the key a. What am I doing wrong?


Answer 1:


You have to tell open that the file is UTF-8 with a BOM by passing the utf-8-sig encoding, which strips the BOM when reading. I know that this works with io.open:

import io

.
.
.
inputFile = io.open("test.csv", "r", encoding='utf-8-sig')
.
.
.

And you have to open the file in text mode, "r" instead of "rb".
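
To make the difference concrete, here is a minimal sketch that writes a small BOM-prefixed file and reads it back with both encodings. It assumes Python 3 (where the csv module works on text streams) rather than the Python 2.7 from the question. The helper name read_csv_keys and the temporary file path are made up for illustration.

```python
import csv
import io
import os
import tempfile

# Illustrative temporary file; the question uses "test.csv" in the working directory.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# Writing with "utf-8-sig" puts the UTF-8 BOM at the start of the file,
# which mirrors the file produced in the question.
with io.open(path, "w", encoding="utf-8-sig", newline="") as output_file:
    writer = csv.DictWriter(output_file, fieldnames=["a", "b"], delimiter=";")
    writer.writeheader()
    for i in range(3):
        writer.writerow({"a": str(i), "b": str(i * 2)})


def read_csv_keys(path, encoding):
    """Hypothetical helper: return the header keys and all rows of the CSV."""
    with io.open(path, "r", encoding=encoding, newline="") as input_file:
        reader = csv.DictReader(input_file, delimiter=";")
        return reader.fieldnames, list(reader)


# Plain "utf-8" keeps the BOM as part of the first header name.
plain_keys, _ = read_csv_keys(path, "utf-8")

# "utf-8-sig" strips the BOM, so the first key is simply "a".
sig_keys, rows = read_csv_keys(path, "utf-8-sig")

print(plain_keys)
print(sig_keys)
print([row["a"] for row in rows])
```

Reading with utf-8-sig is also safe for files that have no BOM: the codec only skips the mark when it is present.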



Source: https://stackoverflow.com/questions/40310042/python-read-csv-bom-embedded-into-the-first-key
