(unicode error) 'unicodeescape' codec can't decode bytes - string with '\u'

一个人的身影 2020-12-31 00:00

Writing my code for Python 2.6, but with Python 3 in mind, I thought it was a good idea to put

from __future__ import unicode_literals

at t

4 Answers

不思量自难忘° (original poster)

2020-12-31 00:24
AFAIK, all that from __future__ import unicode_literals does is to make all string literals of unicode type, instead of string type. That is:
```
>>> type('')
<type 'str'>
>>> from __future__ import unicode_literals
>>> type('')
<type 'unicode'>
```
But str and unicode are still different types, and they behave just like before.
```
>>> type(str(''))
<type 'str'>
```
The result of str() is always of str type.
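As a minimal sketch of that distinction, the snippet below is written to run unchanged on Python 2.6+ and Python 3. The names `literal_type` and `native_type` are only illustrative and do not come from the question.

```python
from __future__ import print_function, unicode_literals  # no-ops on Python 3

# A bare literal: unicode on Python 2 (because of the import), str on Python 3.
literal_type = type('')

# str() always builds the interpreter's native str type,
# whether or not unicode_literals is in effect.
native_type = type(str(''))

print(literal_type.__name__, native_type.__name__)
```

On Python 2 the two names should differ (`unicode` and `str`); on Python 3 both are `str`.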

As for your r'\u' issue, it is by design: with unicode_literals in effect, r'\u' is equivalent to ur'\u' without it. From the docs:

When an 'r' or 'R' prefix is used in conjunction with a 'u' or 'U' prefix, then the \uXXXX and \UXXXXXXXX escape sequences are processed while all other backslashes are left in the string.

This probably stems from the way the lexical analyzer worked in the Python 2 series. In Python 3 it works as you (and I) would expect.

You can type the backslash twice, and then the \u will not be interpreted, but you'll get two backslashes!

Backslashes can be escaped with a preceding backslash; however, both remain in the string
```
>>> ur'\\u'
u'\\\\u'
```
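For comparison, here is a small sketch of how raw strings treat the same input on Python 3, where `\u` is not processed inside a raw literal. It is meant for Python 3 only; the variable names are made up for illustration.

```python
# Python 3 only: inside a raw string, \u is not treated as an escape.
single = r'\u'    # backslash, u
double = r'\\u'   # backslash, backslash, u (both backslashes remain)

print(len(single), len(double))  # 2 3

# The raw literal with a doubled backslash equals a normal literal
# in which each of the two backslashes is escaped.
print(double == '\\\\u')  # True
```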
So IMHO, you have two simple options:
- Do not use raw strings, and escape your backslashes (compatible with python3):
  
  'H:\\unittests'
- Be too clever and use the Unicode code point of the backslash (not compatible with Python 3):
  
  r'H:\u005cunittests'
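
The two options can be compared side by side with a short sketch like the one below. It assumes the same `H:\unittests` path used above; `escaped` and `trick` are hypothetical names chosen for the example.

```python
from __future__ import print_function, unicode_literals  # no-ops on Python 3

# Option 1: escape the backslash. Same result on Python 2 and Python 3.
escaped = 'H:\\unittests'
print(escaped)  # H:\unittests

# Option 2: the \u005c trick. Python 2 (with unicode_literals) decodes
# \u005c to a backslash; Python 3 leaves those six characters untouched.
trick = r'H:\u005cunittests'
print(trick == escaped)  # True on Python 2, False on Python 3
```

Because only the first option gives the same string on both interpreters, it is the safer choice for code that has to run on Python 2 and Python 3.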