remove escape character from string [closed]

Submitted by 不羁岁月 on 2019-12-07 00:39:20

Question


I would like to turn this string:

a = '\\a'

into this one

b = '\a'

It doesn't seem like there is an obvious way to do this with replace?

EDIT: To be more precise, I want to change the escaping of the backslash to escape the character a


Answer 1:


The character '\a' is the ASCII BEL character, chr(7).
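To make the difference between the two strings concrete, here is a small Python 3 check (the variable names follow the question):

```python
# The source string: two characters, a backslash followed by the letter 'a'.
a = '\\a'
# The target string: a single character, BEL (code point 7).
b = '\a'

print(len(a), [hex(ord(ch)) for ch in a])  # 2 ['0x5c', '0x61']
print(len(b), [hex(ord(ch)) for ch in b])  # 1 ['0x7']
print(b == chr(7) == '\x07')               # True
```

So the task is to interpret the escape sequence, not to strip a character.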

To do the conversion in Python 2:

from __future__ import print_function
a = '\\a'
c = a.decode('string-escape')
print(repr(a), repr(c))

output

'\\a' '\x07'

And for future reference, in Python 3:

a = '\\a'
b = bytes(a, encoding='ascii')
c = b.decode('unicode-escape')
print(repr(a), repr(c))

This gives identical output to the above snippet.

In Python 3, if you were working with bytes objects you'd do something like this:

a = b'\\a'
c = bytes(a.decode('unicode-escape'), 'ascii')
print(repr(a), repr(c))

output

b'\\a' b'\x07'

As Antti Haapala mentions, this simple strategy for Python 3 won't work if the source string also contains non-ASCII Unicode characters. In that case, please see his answer for a more robust solution.




Answer 2:


On Python 2 you can use

>>> '\\a'.decode('string_escape')
'\x07'

Note how \a is repr'd as \x07.

If the string is a unicode string that also contains extended (non-ASCII) characters, you need to encode it to a bytestring explicitly first; otherwise the default encoding (ascii!) is used to convert the unicode object to a bytestring.


However, this codec doesn't exist in Python 3, and things are considerably more complicated. You can use the unicode-escape codec to decode, but the result is badly broken if the source string also contains non-ASCII Unicode characters:

>>> '\aäầ'.encode().decode('unicode_escape')
'\x07Ã¤áº§'

The resulting string doesn't consist of Unicode characters but bytes decoded as latin-1. The solution is to re-encode to latin-1 and then decode as utf8 again:

>>> '\\aäầ\u1234'.encode().decode('unicode_escape').encode('latin1').decode()
'\x07äầሴ'
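The round trip above can be wrapped in a small helper. This is only a sketch for Python 3; the function name `unescape` is hypothetical and is not part of the answer:

```python
def unescape(text):
    """Interpret backslash escapes in text (hypothetical helper name)."""
    # Encode to UTF-8 so that non-ASCII characters become plain bytes.
    raw = text.encode('utf-8')
    # unicode_escape handles the escapes, but maps every other byte
    # to a single code point, as if the input were latin-1.
    mangled = raw.decode('unicode_escape')
    # Undo that: recover the original bytes, then decode them as UTF-8.
    return mangled.encode('latin-1').decode('utf-8')

print(repr(unescape('\\a')))      # '\x07'
print(repr(unescape('\\aäầ')))    # '\x07äầ'
```

One caveat with this approach: if the input contains a literal escape such as a backslash followed by u1234, unicode_escape produces a code point above 255 and the latin-1 re-encoding step raises UnicodeEncodeError.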



Answer 3:


Unescape string is what I searched for to find this:

>>> a = r'\a'
>>> a.encode().decode('unicode-escape')
'\x07'
>>> '\a'
'\x07'

That's the way to do it with unicode. Since you're in Python 2 and may not be using unicode, you may actually want this one:

>>> a.decode('string-escape')
'\x07'


Source: https://stackoverflow.com/questions/40452956/remove-escape-character-from-string
