Python HTML Encoding \xc2\xa0

Submitted by 女生的网名这么多〃 on 2020-01-12 13:56:00

Question


I've been struggling with this one for a while. I'm trying to write strings to HTML but have issues with the format once I've cleaned them. Here's an example:

paragraphs = ['Grocery giant and household name Woolworths is battered and bruised. ', 
'But behind the problems are still the makings of a formidable company']

x = str(" ")
for item in paragraphs:
    x = x + str(item)
x

Output:

"Grocery giant and household name\xc2\xa0Woolworths is battered and\xc2\xa0bruised. 
But behind the problems are still the makings of a formidable\xc2\xa0company"

Desired output:

"Grocery giant and household name Woolworths is battered and bruised. 
But behind the problems are still the makings of a formidable company"

I'm hoping you can explain why this happens and how I can fix it. Thanks in advance!


Answer 1:


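\xc2\xa0 is the byte sequence 0xC2 0xA0, which is the UTF-8 encoding of the non-breaking space (U+00A0).

It is not a control character but a whitespace character: it looks like an ordinary space while preventing a line break at its position, and it is common in text copied from HTML, where it is written as &nbsp;. For more information, see Wikipedia: https://en.wikipedia.org/wiki/Non-breaking_space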
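To check this yourself, here is a small sketch in Python 3 (the question's output looks like Python 2, so treat the version as an assumption). The sample bytes are a shortened piece of the output shown in the question, and the variable names are only illustrative.

```python
import unicodedata

# A shortened piece of the output from the question, as raw bytes.
raw = b"name\xc2\xa0Woolworths"

# Decoding as UTF-8 turns the two bytes 0xC2 0xA0 into one character.
text = raw.decode("utf-8")

# The character sitting between "name" and "Woolworths".
nbsp = text[4]

print(hex(ord(nbsp)))           # code point of the character
print(unicodedata.name(nbsp))   # its official Unicode name
print(nbsp.encode("utf-8"))     # encoding it again gives the two bytes back
```

The decoded string is one character shorter than the byte string, because the two bytes collapse into a single code point.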

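I copied the code you pasted in the question and got the expected output, which suggests the non-breaking spaces were turned into ordinary spaces somewhere along the way when the text was pasted.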
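If the non-breaking spaces are still present in your real data, one possible fix is to replace them with ordinary spaces before joining the strings. This is a minimal sketch for Python 3: the helper name `clean` and the constant `NBSP` are made up for illustration, and the sample strings assume the non-breaking spaces sit exactly where your output shows \xc2\xa0.

```python
NBSP = "\u00a0"  # the non-breaking space character


def clean(text):
    """Return text with every non-breaking space replaced by a normal space."""
    return text.replace(NBSP, " ")


# Sample data with non-breaking spaces where the question's output shows them.
paragraphs = [
    "Grocery giant and household name\u00a0Woolworths is battered and\u00a0bruised. ",
    "But behind the problems are still the makings of a formidable\u00a0company",
]

# Clean each item, then join once instead of concatenating in a loop.
result = "".join(clean(item) for item in paragraphs)
print(repr(result))
```

On Python 2 with byte strings, the equivalent would be `item.replace('\xc2\xa0', ' ')`, since there the character appears as two separate bytes.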



Source: https://stackoverflow.com/questions/32419541/python-html-encoding-xc2-xa0
