Efficiently replace bad characters

梦毁少年i 2020-12-07 21:28

I often work with UTF-8 text containing characters like:

\xc2\x99

\xc2\x95

\xc2\x85

etc.

6 Answers
  •  爱一瞬间的悲伤
    2020-12-07 22:00

    These characters are outside the ASCII range, which is why you are getting the errors. To avoid them, specify the encoding explicitly when reading the file:

    import codecs
    # Decode the file as UTF-8 rather than relying on the default codec
    f = codecs.open('file.txt', 'r', encoding='utf-8')
    

    To learn more about these kinds of errors, go through this link.
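
    Since the title asks about replacing the characters rather than only reading them, here is a minimal sketch of one way to do that after decoding. It rests on an assumption the thread does not confirm: that the C1 control characters U+0080 to U+009F (which is what \xc2\x85, \xc2\x95 and \xc2\x99 decode to) are Windows-1252 bytes that were mis-decoded upstream. Each one is mapped to its Windows-1252 meaning, and the few with no such meaning are dropped. The names build_c1_table and clean_text are hypothetical.

```python
def build_c1_table():
    """Map each C1 control character (U+0080..U+009F) to a replacement."""
    table = {}
    for code in range(0x80, 0xA0):
        try:
            # Assumption: the character started life as a Windows-1252 byte.
            table[code] = bytes([code]).decode("cp1252")
        except UnicodeDecodeError:
            # Windows-1252 leaves this byte undefined; None deletes the character.
            table[code] = None
    return table


# Built once so that every call to clean_text reuses the same table.
C1_TABLE = build_c1_table()


def clean_text(text):
    """Replace or drop C1 control characters in an already-decoded string."""
    return text.translate(C1_TABLE)


if __name__ == "__main__":
    raw = b"caf\xc3\xa9 \xc2\x95 item\xc2\x85 brand\xc2\x99"
    print(clean_text(raw.decode("utf-8")))
```

    Building the table once and calling str.translate handles all 32 code points in a single pass over the string, which tends to be cheaper than chaining one str.replace call per character. If the characters should simply be removed, map every entry to None instead.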
