Python Throwing “'utf8' codec can't decode byte 0xd0 in position 0” Error

北战南征, posted on 2019-12-10 02:29:18

Question


I am trying to load an existing worksheet and import a text file (comma-separated values) into it. Screenshots of both are shown below.

Excel Sheet:

Text File:

I am using the code shown below:

    # importing necessary modules for performing the required operation
    import glob
    import csv
    from openpyxl import load_workbook
    import xlwt

    #read the text file(s) using the csv module and set the delimiter and quotechar
    for filename in glob.glob("E:\Scripting_Test\Phase1\*.txt"):
        spamReader = csv.reader((open(filename, 'rb')), delimiter=',')


        #read the excel file and using xlwt modules and set the active sheet
        wb = load_workbook(filename=r"E:\Scripting_Test\SeqTem\Seq0001.xls")
        ws = wb.worksheets(0)


        #write the data that is in text file to excel file
        for rowx, row in enumerate(spamReader):
            for colx, value in enumerate(row):
                ws.write(rowx, colx, value)

        wb.save()

I am getting the following error message:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: invalid continuation byte

One more question: how can I tell Python to import the text data starting from cell A3 in the Excel sheet?


Answer 1:


Unicode encoding confuses me, but can't you force the conversion to ignore invalid bytes like this:

value = unicode(value, errors='ignore')

Here is a great answer for more reading on unicode: unicode().decode('utf-8', 'ignore') raising UnicodeEncodeError
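
For reference, here is a minimal Python 3 sketch of the same idea. The question's code is Python 2, where `unicode()` exists; in Python 3 the closest equivalent is `bytes.decode`. The sample bytes are made up for illustration. Note that `errors="ignore"` silently drops the offending bytes, so it tends to hide the problem rather than fix it.

```python
# Hypothetical sample: a stray 0xd0 byte followed by plain ASCII text.
raw = b"\xd0abc"

# Strict decoding (the default) raises the same kind of error as in the question.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    strict_error = str(exc)

# errors="ignore" silently drops bytes that cannot be decoded.
ignored = raw.decode("utf-8", errors="ignore")

# errors="replace" leaves a visible U+FFFD marker where data was lost.
replaced = raw.decode("utf-8", errors="replace")

print(strict_error)
print(repr(ignored))
print(repr(replaced))
```

If the input is not really UTF-8 text at all (for example, a binary file), neither handler produces meaningful output, so it is worth checking what the bytes actually are first.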




Answer 2:


openpyxl only handles the OOXML formats (xlsx/xlsm). Try saving the file in xlsx format instead of xls using Excel.
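
Once the workbook is in xlsx format, writing the CSV rows with openpyxl could look roughly like the sketch below. It assumes Python 3 and an installed openpyxl; the file name and sample data are made up. The row offset of 3 is one way to start at cell A3, as asked in the question.

```python
import csv
import io
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical stand-in for the comma-separated text file from the question.
text_data = io.StringIO("a,b,c\n1,2,3\n")

# Create a small .xlsx to act as the existing workbook (the path is made up).
path = os.path.join(tempfile.mkdtemp(), "Seq0001_demo.xlsx")
Workbook().save(path)

wb = load_workbook(filename=path)
ws = wb.worksheets[0]  # worksheets is a list, so index it with []

# openpyxl rows and columns are 1-based; starting at row 3, column 1 means A3.
start_row, start_col = 3, 1
for rowx, row in enumerate(csv.reader(text_data, delimiter=",")):
    for colx, value in enumerate(row):
        ws.cell(row=start_row + rowx, column=start_col + colx, value=value)

wb.save(path)  # save() needs a target filename
```

Compared with the code in the question, this uses `ws.cell(...)` rather than `ws.write(...)`, which belongs to xlwt and not to openpyxl.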

If you want to convert an xls file to xlsx in code, try one of the options below:

  1. On Windows, you can use the excelcnv tool to convert xls to xlsx.
  2. On Linux, please check this article.
  3. Or, you could convert to xlsx by using xlrd in Python. Please check this Q&A.
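
Before converting, it can help to confirm which format the file really is. Legacy xls files are OLE2 compound files whose first byte is 0xD0, which would match the byte reported in the error message, while xlsx files are ZIP archives. The sketch below checks those leading bytes; `sniff_excel_format` is a hypothetical helper name, and the result is only a guess based on the container type.

```python
# Leading bytes ("magic numbers") of the two container formats.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls/.doc/.ppt
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx/.xlsm are ZIP archives


def sniff_excel_format(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header.startswith(OLE2_MAGIC):
        return "xls (OLE2 compound file)"
    if header.startswith(ZIP_MAGIC):
        return "xlsx/xlsm (ZIP-based OOXML)"
    return "unknown"


# Demonstration with in-memory samples instead of a real file.
print(sniff_excel_format(OLE2_MAGIC + b"\x00" * 8))
print(sniff_excel_format(ZIP_MAGIC + b"\x14\x00"))
print(sniff_excel_format(b"a,b,c\n"))
```

For a real file, the header can be read with `open(path, "rb").read(8)` and passed to the helper.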



Answer 3:


Hi. Are you sure the file doesn't start with a UTF-8 BOM?

You might try reading it with the UTF-8 BOM codec. In general, UTF-8 on Windows can be a bit troublesome. That said, the byte shown in the error may not be the BOM.
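
As a rough illustration of the BOM codec in Python 3, the sketch below writes a made-up sample file that starts with a UTF-8 BOM and reads it back with both `utf-8` and `utf-8-sig`. The UTF-8 BOM begins with byte 0xEF rather than 0xD0, which fits the caveat above that the reported byte may not be a BOM.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical sample file: UTF-8 text preceded by a BOM.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + "name,value\nx,1\n".encode("utf-8"))

# Plain "utf-8" keeps the BOM as an invisible U+FEFF in the first field.
with open(path, encoding="utf-8", newline="") as f:
    rows_plain = list(csv.reader(f))

# "utf-8-sig" strips the BOM if one is present.
with open(path, encoding="utf-8-sig", newline="") as f:
    rows_sig = list(csv.reader(f))

os.remove(path)

print(repr(rows_plain[0][0]))
print(repr(rows_sig[0][0]))
print(hex(codecs.BOM_UTF8[0]))
```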



Source: https://stackoverflow.com/questions/19117836/python-throwing-utf8-codec-cant-decode-byte-0xd0-in-position-0-error
