Python: Using .format() on a Unicode-escaped string

眼角桃花 2020-12-04 08:46

I am using Python 2.6.5. My code requires the use of the "greater than or equal to" sign. Here it goes:

>>> s = u'\u2265'
>>> print s
≥


        
3 Answers
  • 2020-12-04 09:09

    Just make the format string a unicode string as well:

    >>> s = u'\u2265'
    >>> print s
    ≥
    >>> print "{0}".format(s)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2265' in position 0: ordinal not in range(128)
    >>> print u"{0}".format(s)
    ≥
    >>> 
    
  • 2020-12-04 09:15

    A bit more information on why that happens.

    >>> s = u'\u2265'
    >>> print s
    ≥

    works because print encodes unicode strings with the encoding of your output stream, which was likely set to UTF-8. (You can check with import sys; print sys.stdout.encoding.)
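
If you want to check what print has to work with before printing, here is a minimal sketch that should run on Python 2.6+ and 3. The helper names stdout_encoding and encode_for_stdout are made up for illustration, and falling back to ASCII when no encoding is reported is an assumption on my part:

```python
import sys


def stdout_encoding(default="ascii"):
    """Return the encoding reported by sys.stdout, or `default` if none is set."""
    # sys.stdout.encoding can be None, e.g. when output is piped on Python 2,
    # and replacement stream objects may not have the attribute at all.
    return getattr(sys.stdout, "encoding", None) or default


def encode_for_stdout(text):
    """Encode `text` for the current stdout without raising."""
    # "replace" substitutes '?' for characters the codec cannot represent,
    # instead of raising UnicodeEncodeError.
    return text.encode(stdout_encoding(), "replace")


# Show the bytes that would be written for the >= sign (repr keeps this ASCII-safe).
print(repr(encode_for_stdout(u"\u2265")))
```

On a UTF-8 terminal the encoded value is the three UTF-8 bytes for ≥; with an ASCII stream you get a question mark instead of an exception.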

    >>> print "{0}".format(s)

    fails because format returns the same type as the string it is called on (I couldn't find documentation on this, but this is the behavior I've noticed). In Python 2, "{0}" is a byte string (str), so format has to convert s to a byte string, and that implicit conversion uses the default ASCII codec, which results in the exception. Observe:

    >>> s = u'\u2265'
    >>> s.encode('ascii')
    Traceback (most recent call last):
      File "<input>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2265' in position 0: ordinal not in range(128)
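
The same check can be written as a small script instead of a REPL session. This is only a sketch: try_encode is a hypothetical helper, and it assumes the implicit conversion behaves like an explicit encode with the ASCII codec, which matches the traceback above:

```python
GREATER_EQUAL = u"\u2265"


def try_encode(text, codec):
    """Return the encoded bytes, or None if `codec` cannot represent `text`."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


# ASCII only covers code points 0-127, so U+2265 cannot be encoded.
ascii_result = try_encode(GREATER_EQUAL, "ascii")

# UTF-8 can represent every code point; U+2265 becomes three bytes.
utf8_result = try_encode(GREATER_EQUAL, "utf-8")

print(repr(ascii_result))
print(repr(utf8_result))
```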
    

    So that is basically why these approaches work:

    >>> s = u'\u2265'
    >>> print u'{0}'.format(s)
    ≥
    >>> print '{0}'.format(s.encode('utf-8'))
    ≥
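
If format strings come from several places and you can't be sure they are all unicode, one option is to promote them before formatting. A minimal sketch, assuming byte-string templates contain UTF-8 (or plain ASCII); format_text is a hypothetical helper, not something from the standard library:

```python
def format_text(template, *args, **kwargs):
    """Format with a text template, decoding a byte-string template first."""
    if isinstance(template, bytes):
        # Assumption: byte-string templates hold UTF-8 (or plain ASCII) data.
        template = template.decode("utf-8")
    return template.format(*args, **kwargs)


# Explicit field indexes ({0}) are used because Python 2.6 does not accept {}.
result = format_text(b"x {0} y", u"\u2265")
```

The result is always a text (unicode) string, so the encoding step is left to print or to whatever writes the output.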
    

    For reference, the Python 2 documentation on string literals says: "The source character set is defined by the encoding declaration; it is ASCII if no encoding declaration is given in the source file" (https://docs.python.org/2/reference/lexical_analysis.html#string-literals).

  • 2020-12-04 09:21

    Unicode strings need unicode format strings.

    >>> print u'{0}'.format(s)
    ≥
    