Reading a UTF8 CSV file with Python

青春惊慌失措 2020-11-22 12:20

I am trying to read a CSV file with accented characters with Python (only French and/or Spanish characters). Based on the Python 2.5 documentation for the csvreader (http://

9 answers
  •  刺人心 (original poster)
    2020-11-22 12:56

    Looking at the Latin-1 range of the Unicode table, I see the code point U+00E9, "LATIN SMALL LETTER E WITH ACUTE". This is the accented character in your sample data. A simple test in Python 2 shows that the UTF-8 encoding of this character (the two bytes 0xC3 0xA9) is different from its Unicode code point (0xE9), which is what the repr of the unicode object displays.

    >>> u'\u00e9'
    u'\xe9'
    >>> u'\u00e9'.encode('utf-8')
    '\xc3\xa9'

    I suggest you call encode("UTF-8") on the unicode data before passing it to the special unicode_csv_reader(). Data read straight from a file does not tell you which encoding it is in, so check the actual character values, for example with repr().
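For anyone on Python 3 (an assumption; the question is about Python 2.5), here is a minimal sketch of the same check. The helper names `describe` and `read_utf8_csv` and the sample data are made up for illustration. It shows the code point next to the UTF-8 bytes, then reads a small UTF-8 CSV by letting `open()` do the decoding before `csv.reader` sees the text.

```python
import csv
import os
import tempfile


def describe(text):
    """Return (code points as hex strings, UTF-8 bytes) for a string."""
    return [hex(ord(ch)) for ch in text], text.encode("utf-8")


def read_utf8_csv(path):
    """Return every row of a UTF-8 encoded CSV file as a list of str."""
    # newline="" is what the csv module documentation recommends for files.
    with open(path, encoding="utf-8", newline="") as f:
        return list(csv.reader(f))


# Hypothetical sample data: write UTF-8 bytes to a temp file, read them back.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write("nom,ville\nJos\u00e9,Orl\u00e9ans\n".encode("utf-8"))
rows = read_utf8_csv(path)
os.remove(path)
```

Calling `describe("\u00e9")` gives the code point 0xe9 alongside the bytes `b'\xc3\xa9'`, the same two values as in the Python 2 session above. If the file is actually Latin-1 rather than UTF-8, decoding as UTF-8 will usually raise `UnicodeDecodeError`, which is a useful hint in itself.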
