How to print Chinese characters in my code, using Python

别那么骄傲 2020-11-30 05:22

This is my code:

print '哈哈'.decode('gb2312').encode('utf-8')

...and it prints:

SyntaxError: Non-ASCII character '\x


        
6 Answers
  •  时光取名叫无心
    2020-11-30 05:41

    encode is meant for unicode strings: it converts unicode text into bytes in a specific encoding. Calling it on a byte string (a plain str) that contains non-ASCII data fails.

    Conversely, decode is meant for byte strings: it converts bytes in a specific encoding into a unicode string, and it fails on a unicode string that contains non-ASCII characters.
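A minimal sketch of the two directions described above. It is written to run on both Python 2 and Python 3, which is an assumption on my part (the session in this answer is Python 2), and the names `text`, `gbk_bytes`, `utf8_bytes` and `round_trip` are illustrative only.

```python
# -*- coding: utf-8 -*-
# Illustrative names; escapes are used so the result does not depend on
# how this file or the terminal is encoded.
text = u'\u54c8\u54c8'                 # unicode text: the two characters 哈哈

# encode: unicode text -> bytes in a chosen encoding
gbk_bytes = text.encode('gbk')
utf8_bytes = text.encode('utf-8')

# decode: bytes -> unicode text, using the encoding the bytes were written in
round_trip = gbk_bytes.decode('gbk')

print(repr(gbk_bytes))
print(repr(utf8_bytes))
print(round_trip == text)
```

The same text yields different bytes under GBK and UTF-8, which is why the encoding has to be named explicitly in each direction.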

    If you put a u prefix before a string literal, you get a unicode string. You can use isinstance(s, unicode) to check whether s is a unicode string.

    Try the code below. Hint: on a Chinese-language version of Windows, the default encoding is "gbk".

    >>> a = '哈哈'
    >>> b = u'哈哈'
    >>> isinstance(a,unicode)
    False
    >>> isinstance(b,unicode)
    True

    >>> a
    '\xb9\xfe\xb9\xfe'
    >>> b
    u'\u54c8\u54c8'

    >>> a.decode('gbk')
    u'\u54c8\u54c8'
    >>> a_unicode = a.decode('gbk')
    >>> a_unicode
    u'\u54c8\u54c8'

    >>> print a_unicode
    哈哈
    >>> a_unicode.encode('gbk') == a
    True
    >>> a_unicode == b
    True

    >>> a.encode('gbk')
    Traceback (most recent call last):
      File "", line 1, in
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb9 in position 0: ordinal not in range(128)

    >>> b.decode('gbk')
    Traceback (most recent call last):
      File "", line 1, in
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
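
For a script rather than an interactive session: the `SyntaxError: Non-ASCII character` in the question is what Python 2 reports when a source file contains non-ASCII bytes but declares no source encoding. Below is a hedged sketch, assuming the file is saved as UTF-8 and that the goal is to turn GBK bytes into UTF-8 bytes. `gbk_to_utf8` is a hypothetical helper, not part of any library.

```python
# -*- coding: utf-8 -*-
# The line above declares the source encoding; it assumes this file is
# actually saved as UTF-8.
from __future__ import print_function


def gbk_to_utf8(raw):
    """Hypothetical helper: re-encode GBK bytes as UTF-8 bytes."""
    return raw.decode('gbk').encode('utf-8')


gbk_bytes = b'\xb9\xfe\xb9\xfe'        # 哈哈 as GBK bytes, as shown in the session
utf8_bytes = gbk_to_utf8(gbk_bytes)

print(repr(utf8_bytes))
print(utf8_bytes.decode('utf-8') == u'\u54c8\u54c8')
```

Going through unicode in the middle (decode first, then encode) is the design choice here: the bytes are never reinterpreted directly from one encoding to another.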
