UnicodeDecodeError: how to skip invalid characters

走了就别回头了 2020-12-17 04:54

Is there any way to preprocess text files and skip these characters?

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 1395: invalid sta


        
2 answers
  • 2020-12-17 05:04

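    Try decoding with errors='ignore', which drops any bytes that are not valid UTF-8 (str here stands for your byte string):

    str.decode('utf-8', errors='ignore')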
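
    For whole files, the same idea can be applied when opening the file. This is a minimal sketch assuming Python 3; the helper name read_text_skipping_invalid and the sample bytes are made up for illustration and are not from the question.

```python
import os
import tempfile


def read_text_skipping_invalid(path, encoding="utf-8"):
    """Read a text file, silently dropping bytes that are invalid in `encoding`."""
    # errors="ignore" makes the decoder skip undecodable bytes instead of raising.
    with open(path, "r", encoding=encoding, errors="ignore") as f:
        return f.read()


# Hypothetical sample data: 0xa1 on its own is not a valid UTF-8 start byte.
raw = b"price: \xa1100 ok"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    demo_path = tmp.name

cleaned = read_text_skipping_invalid(demo_path)

# Alternative: keep a visible marker (U+FFFD) where bytes were dropped.
replaced = raw.decode("utf-8", errors="replace")

os.remove(demo_path)
```

    Note that errors='ignore' loses data without any warning; errors='replace' is often easier to debug because the bad positions stay visible.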
    
  • 2020-12-17 05:09

    I think your text file contains some special characters that are not valid UTF-8, so the 'utf-8' codec can't decode them.

    Try using 'ISO-8859-1' instead of 'utf-8', like this:

       import sys
       # Python 2 only: reload(sys) restores setdefaultencoding, which site.py removes at startup
       reload(sys).setdefaultencoding("ISO-8859-1")
    
       # put your code here
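
    If changing the interpreter-wide default is not an option, the bytes can also be decoded explicitly. The following is only a sketch, assuming Python 3; decode_with_fallback is a hypothetical helper, and whether ISO-8859-1 is really the file's encoding is an assumption you would need to check.

```python
def decode_with_fallback(data, primary="utf-8", fallback="iso-8859-1"):
    """Decode bytes as `primary`; if that fails, decode as `fallback`."""
    try:
        return data.decode(primary)
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value to a character, so this never raises,
        # but the result is only meaningful if the data really is Latin-1.
        return data.decode(fallback)


# Hypothetical sample: Latin-1 bytes that are not valid UTF-8.
latin1_bytes = b"\xa1Hola! caf\xe9"
utf8_bytes = "café".encode("utf-8")

from_latin1 = decode_with_fallback(latin1_bytes)
from_utf8 = decode_with_fallback(utf8_bytes)
```

    Because ISO-8859-1 accepts any byte, a successful decode does not prove the encoding was right; it only guarantees that no exception is raised.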
    