Problems reading a CSV correctly due to UnicodeDecodeError in Python 3

Submitted by 空扰寡人 on 2020-05-17 14:44:50

Question


I created a CSV file in which I put some song lyrics, using this:

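import csv
import io

with io.open('songs.csv', 'a+', encoding='utf-8', newline='') as file:  # newline='' as the csv docs recommend
    writer = csv.writer(file, dialect='excel')
    writer.writerow(input_row)  # input_row holds the fields of one song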

The CSV (opened with Excel) looks quite strange. I don't know how to upload files here, so apologies for the picture.

As you can see, the delimiters for the CSV are commas (the columns should be Artist, Album, Title, Lyric).
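A minimal sketch related to how Excel displays such a file, under an assumption the question does not confirm: Excel on Windows commonly reads a CSV without a byte order mark (BOM) using a legacy code page, which would garble accented letters. Writing with Python's 'utf-8-sig' codec prepends a BOM that Excel can use to detect UTF-8. The file name songs_demo.csv and the row contents are hypothetical, and the sketch uses 'w' mode on a fresh file rather than the 'a+' mode from the question.

```python
import csv
import os
import tempfile

# Hypothetical row; the real input_row is not shown in the question.
sample_row = ["Some Artist", "Some Album", "Some Title", "perché è così"]

# Hypothetical file in a temporary directory so the sketch leaves songs.csv alone.
path = os.path.join(tempfile.mkdtemp(), "songs_demo.csv")

# 'utf-8-sig' writes the bytes EF BB BF (the UTF-8 BOM) before the first row.
with open(path, "w", encoding="utf-8-sig", newline="") as file:
    writer = csv.writer(file, dialect="excel")
    writer.writerow(sample_row)

# Inspect the raw bytes to confirm the BOM is present.
with open(path, "rb") as fh:
    first_bytes = fh.read(3)

# Reading back with 'utf-8-sig' strips the BOM again.
with open(path, encoding="utf-8-sig", newline="") as fh:
    rows = list(csv.reader(fh))
```

Whether this changes what Excel shows depends on the Excel version and locale, so treat it as something to try rather than a guaranteed fix.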

I noticed that I had some Spanish and Italian lyrics, and characters like 'à', 'è', or 'ç' were not decoded correctly, so I changed those characters manually (in Excel). Now the command

import pandas as pd
df = pd.read_csv(r'C:/Users........', names=['Artist', 'Album', 'Title', 'lyrics'], encoding='utf-8')

doesn't work:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 14: invalid continuation byte
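For reference, this error message can be reproduced in isolation. The sketch below assumes (the question does not confirm it) that Excel re-saved the file in a Windows code page such as cp1252, where 'è' is the single byte 0xE8. In UTF-8, 0xE8 announces a three-byte sequence, so a following ASCII byte is rejected as an invalid continuation byte. The sample text is hypothetical.

```python
# Hypothetical sample text; in cp1252 the letter 'è' is the single byte 0xE8.
raw = "è vero".encode("cp1252")

try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # 0xE8 starts a 3-byte UTF-8 sequence, but the next byte is a plain space.
    error_message = str(exc)

# Decoding with the code page that was actually used recovers the text.
decoded = raw.decode("cp1252")
```

Here error_message names byte 0xe8 and "invalid continuation byte", matching the traceback above apart from the byte position.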

This is the first problem, but I understand it. The point is that every other encoding I try lets me read the characters, which is fine, but the DataFrame ends up looking like this (again, sorry for the screenshot, but it was impossible to get a df.head() printout clearer than a photo):

Source: https://stackoverflow.com/questions/61761587/problems-reading-correctly-a-csv-due-to-unicodedecodeerror-in-python3
