ElementTree and unicode

星月不相逢 2020-12-06 00:38

I have this character in an XML file:


  
      fumè
  

I t

6 answers
  •  渐次进展
    2020-12-06 01:05

    You need to decode UTF-8 strings into a unicode object. So

    string_data.encode('utf-8')
    

    should be

    string_data.decode('utf-8')
    

    assuming string_data is actually a UTF-8 string.

    So to summarize: to get a UTF-8 string from a unicode object, you encode the unicode object (using the UTF-8 encoding); to turn a string into a unicode object, you decode the string using the encoding it was written in.
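As a minimal sketch of that round trip, here is the same idea in Python 3 terms, where `str` is the unicode type and `bytes` holds the encoded data. The element name `color` and the variable names are assumptions made up for illustration; only the text `fumè` comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the <color> element name is an assumption.
# "\u00e8" is the character "è", written as an escape to stay ASCII-safe.
xml_text = "<color>fum\u00e8</color>"

# Encoding: unicode text -> UTF-8 byte string.
xml_bytes = xml_text.encode("utf-8")

# Decoding: UTF-8 byte string -> unicode text.
round_tripped = xml_bytes.decode("utf-8")

# ElementTree parses the encoded bytes (UTF-8 is the default when there is
# no XML declaration) and hands back unicode text.
root = ET.fromstring(xml_bytes)
parsed_text = root.text
```

Calling `.encode` on data that is already encoded, or `.decode` on data that is already unicode, is the usual source of the error described in the question.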

    For more details on the concepts I suggest reading "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets" (not Python specific).
