Python ASCII codec can't encode character error during write to CSV

悲哀的现实 2020-12-05 21:23

I'm not entirely sure what I need to do about this error. I assumed that it had to do with needing to add .encode('utf-8'). But I'm not entirely sure if that's what I n

2 answers
  •  清歌不尽
    2020-12-05 21:45

    The Python 2.x csv module is broken: it does not handle Unicode text. You have three options, in order of complexity:

    1. Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv) as a drop-in replacement (but see the edit below). Example:

      import unicodecsv

      # Open in binary mode; unicodecsv decodes the bytes itself
      with open("myfile.csv", 'rb') as my_file:
          r = unicodecsv.DictReader(my_file, encoding='utf-8')
      

    2. Read the CSV manual regarding Unicode: https://docs.python.org/2/library/csv.html (see the examples at the bottom).

    3. Manually encode each item as UTF-8:

      for cell in row.findAll('td'):
          text = cell.text.replace('[','').replace(']','')
          list_of_cells.append(text.encode("utf-8"))
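
    The error in the title typically appears because the Python 2 csv writer converts each unicode cell with the default ASCII codec. As a rough sketch of option 3, the encoding step can be pulled into a small helper that is applied to every row before it is written. The names to_utf8 and encode_row are made up for this illustration and are not part of any library; on Python 3 this step is unnecessary, because the csv module accepts text directly.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers for illustration only; not part of the csv module.

def to_utf8(cell):
    """Return the cell as a UTF-8 encoded byte string."""
    if isinstance(cell, bytes):
        return cell  # already a byte string (a plain str on Python 2)
    if not isinstance(cell, type(u"")):
        cell = u"%s" % (cell,)  # numbers and other objects become text first
    return cell.encode("utf-8")


def encode_row(row):
    """Encode every cell of one row."""
    return [to_utf8(cell) for cell in row]


encoded = encode_row([u"Zo\u00eb", u"K\u00f6ln", 42])
```

    On Python 2, a row prepared this way can be passed to csv.writer(...).writerow(...) on a file opened in 'wb' mode.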
      

    Edit: I found that python-unicodecsv is also broken when reading UTF-16. It complains about any 0x00 bytes.

    Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.

    Install:

    pip install backports.csv
    

    Usage:

    from backports import csv
    import io

    # newline='' lets the csv module handle line endings itself
    with io.open(filename, encoding='utf-8', newline='') as f:
        r = csv.reader(f)
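
    Since the question is about writing rather than reading, here is a minimal sketch of the writing side. It assumes that the backports.csv writer mirrors the Python 3 csv.writer API, and it falls back to the standard csv module when backports.csv is not installed (as on Python 3). The helper names write_rows and read_rows, the sample data, and the temporary file path are made up for this illustration.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

try:
    from backports import csv  # Python 2: pip install backports.csv
except ImportError:
    import csv  # Python 3: the standard module already works with text


def write_rows(path, rows):
    """Write rows of unicode strings to path as UTF-8 encoded CSV."""
    # io.open does the encoding, so the csv writer only ever sees text
    with io.open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)


def read_rows(path):
    """Read the file back as a list of rows of unicode strings."""
    with io.open(path, encoding="utf-8", newline="") as f:
        return list(csv.reader(f))


rows = [[u"name", u"city"], [u"Zo\u00eb", u"K\u00f6ln"]]
path = os.path.join(tempfile.mkdtemp(), "out.csv")
write_rows(path, rows)
round_trip = read_rows(path)
```

    With this approach no cell needs an explicit .encode('utf-8') call; the encoding is handled once, at the file level.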
    
