Can someone explain to me the use of unicode_escape as an encoding argument in Python 3.6?

Submitted by 杀马特。学长 韩版系。学妹 on 2020-12-11 06:41:00

Question


I work with large pandas DataFrames on a daily basis. They are fed with information that we parse from a web API (the XML encoding is UTF-8) local to our network.

After I populate the DataFrame and export it as a CSV file, I start getting encoding errors (the local machine uses cp1252), which I've had to deal with for the past few weeks.
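For illustration, here is a minimal sketch of how this kind of error can arise. It assumes the failure is a UnicodeEncodeError raised when text containing characters outside cp1252 is written with that codec; the sample string and the helper name try_encode are made up for this example and are not from my real data.

```python
# Hypothetical sample: "é" exists in cp1252, the CJK characters do not.
text = "caf\u00e9 \u4e2d\u6587"


def try_encode(s, encoding):
    """Return the encoded bytes, or the exception if encoding fails."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


result_utf8 = try_encode(text, "utf-8")      # UTF-8 can represent every code point
result_cp1252 = try_encode(text, "cp1252")   # cp1252 has no mapping for the CJK characters

print(type(result_utf8).__name__)
print(type(result_cp1252).__name__)
```

If that assumption holds, the problem is the target codec of the export rather than the UTF-8 source.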

The solution I finally found was [here][1], in tangfucious's response:

    df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))

This is a line of code that takes a string and encodes it using .encode('unicode_escape'), then decodes the resulting bytes as UTF-8.
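To make the two steps concrete, here is a small sketch of what I believe the line does to a single value. The sample string and variable names are invented for this example. As far as I can tell, only non-ASCII characters (plus backslashes and control characters) are turned into escape sequences, while printable ASCII passes through unchanged.

```python
# Hypothetical cell value containing non-ASCII characters.
original = "caf\u00e9 \u4e2d"

# Step 1: str -> bytes. Non-ASCII characters become backslash escapes,
# so the resulting bytes are pure ASCII.
escaped_bytes = original.encode("unicode_escape")

# Step 2: bytes -> str. ASCII bytes decode identically under UTF-8,
# so this yields a str holding the literal escape sequences.
escaped_text = escaped_bytes.decode("utf-8")

# The escaped text is pure ASCII, so cp1252 can now encode it.
cp1252_bytes = escaped_text.encode("cp1252")

# The original characters are recoverable by reversing the escape step.
round_trip = escaped_text.encode("ascii").decode("unicode_escape")

print(escaped_bytes)
print(escaped_text)
print(round_trip == original)
```

So the output file would contain the escape sequences as literal text rather than the original characters, which may or may not be what is wanted in the CSV.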

Can someone explain to me how this code works? Unfortunately, I'm new to SO, so I wasn't able to comment on his response.

What is the purpose of unicode-escape under the hood (aside from the obvious: adding a \ to each Unicode code point)? How does this affect decoding as UTF-8? Why is this necessary? Isn't it always better to encode and decode using the same encoding?

Is there any other use for 'unicode_escape'?

Source: https://stackoverflow.com/questions/41967354/can-someone-explain-to-me-the-use-of-unicode-escape-as-an-encoding-argument-in-p
