ElementTree and unicode

星月不相逢 2020-12-06 00:38

I have this char in an xml file:


  
      fumè
  

I t

6 answers
  •  囚心锁ツ
    2020-12-06 00:57

    You do not need to decode the XML yourself for ElementTree to work. XML carries its own encoding information (defaulting to UTF-8), and ElementTree does the decoding for you, returning unicode text:

    >>> from xml.etree import ElementTree
    >>> data = '''\
    ... 
    ...   
    ...       fumè
    ...   
    ... 
    ... '''
    >>> x = ElementTree.fromstring(data)
    >>> x[0][0].text
    u'fum\xe8'
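
    For reference, here is a minimal self-contained sketch of the same idea on Python 3. The element names (root, item, name) are placeholders I made up, since the markup of the original document is not shown; only the nesting depth and the text fumè come from the example above. It also re-encodes the same document as ISO-8859-1 to illustrate that the parser follows whatever encoding the XML declaration states.

```python
from xml.etree import ElementTree

# Hypothetical document: the element names <root>, <item> and <name> are
# placeholders, because the original markup is not shown in the thread.
xml_text = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<root>\n'
    '  <item>\n'
    '    <name>fumè</name>\n'
    '  </item>\n'
    '</root>\n'
)

# Hand the parser raw bytes; it reads the encoding from the XML declaration,
# so no manual .decode() call is needed.
utf8_bytes = xml_text.encode("utf-8")
root = ElementTree.fromstring(utf8_bytes)
value = root[0][0].text  # text of the first grandchild element

# Same document, but declared and stored as ISO-8859-1 instead of UTF-8.
latin1_bytes = xml_text.replace("utf-8", "iso-8859-1").encode("iso-8859-1")
same_value = ElementTree.fromstring(latin1_bytes)[0][0].text

print(repr(value), repr(same_value))
```

    Both calls should give the same str, even though the underlying bytes differ, because the decoding is driven by the declaration and not by the caller.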
    

    If your data is in a file or file-like object, just pass the filename or the file object directly to the ElementTree.parse() function:

    x = ElementTree.parse('file.xml')
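
    A rough sketch of both variants on Python 3, assuming a throwaway file written to a temporary directory. The file content and its element names are hypothetical placeholders; only ElementTree.parse() itself comes from the answer above.

```python
import os
import tempfile
from xml.etree import ElementTree

# Hypothetical content; the element names are placeholders.
xml_bytes = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<root><item><name>fumè</name></item></root>'
).encode("utf-8")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.xml")
    with open(path, "wb") as handle:
        handle.write(xml_bytes)

    # Variant 1: pass the filename; parse() opens the file itself.
    tree = ElementTree.parse(path)
    from_name = tree.getroot()[0][0].text

    # Variant 2: pass a file object, opened in binary mode so the parser
    # sees the raw bytes and can apply the declared encoding.
    with open(path, "rb") as handle:
        from_object = ElementTree.parse(handle).getroot()[0][0].text

print(repr(from_name), repr(from_object))
```

    Note that parse() returns an ElementTree wrapper, so getroot() is needed to reach the root element, whereas fromstring() returns the root element directly.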
    
