ValueError: unichr() arg not in range(0x10000) (narrow Python build)

星月不相逢 2020-12-05 19:07

I am trying to convert an HTML entity to a Unicode character. The entity is &#976918;. When I try the following, I get the error in the title:

unichr(int(976918))
3 answers
  •  粉色の甜心
    2020-12-05 19:55

    You can decode a string that has a Unicode escape (\U followed by 8 hex digits, zero-padded) using the "unicode-escape" encoding:

    >>> s = "\\U%08x" % 976918
    >>> s
    '\\U000ee816'
    
    >>> c = s.decode('unicode-escape')
    >>> c
    u'\U000ee816'
    

    On a narrow build it's stored as a UTF-16 surrogate pair:

    >>> list(c)
    [u'\udb7a', u'\udc16']
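
The two values in that list can be reproduced by hand. The following is a minimal sketch of the standard UTF-16 surrogate arithmetic, not part of the original answer; the function name `to_surrogate_pair` is hypothetical and chosen only for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogate_pair(976918)  # 976918 == 0xEE816
print("%#x %#x" % (high, low))         # 0xdb7a 0xdc16
```

The function uses only integer arithmetic, so it should behave the same on Python 2 and Python 3.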
    

    This surrogate pair (two UTF-16 code units) is correctly treated as a single code point during encoding:

    >>> c.encode('utf-8')
    '\xf3\xae\xa0\x96'
    
    >>> '\xf3\xae\xa0\x96'.decode('utf-8')
    u'\U000ee816'
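
If the goal is a drop-in replacement for `unichr()`, one possible approach is sketched below. It is an assumption-laden alternative rather than the method from the answer above: the helper name `code_point_to_char` is hypothetical, and the fallback goes through the `utf-32-le` codec instead of `unicode-escape`.

```python
import struct


def code_point_to_char(code_point):
    """Return the text string for a Unicode code point (hypothetical helper)."""
    try:
        return unichr(code_point)  # Python 2; succeeds on wide builds
    except NameError:
        return chr(code_point)     # Python 3: unichr() no longer exists
    except ValueError:
        # Narrow Python 2 build: pack the code point as 4 little-endian bytes
        # and let the UTF-32 decoder build the surrogate pair.
        return struct.pack("<I", code_point).decode("utf-32-le")


c = code_point_to_char(976918)
print(repr(c.encode("utf-8")))
```

On a narrow build the result has length 2 (the surrogate pair), but it should encode to the same four UTF-8 bytes shown above.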
    
