I want to remove all types of escape sequences from a list of strings. How can I do this? Input:
['william', 'short', '\x80', 'twitter', '\xaa',
Something like this?
>>> from ast import literal_eval
>>> s = r'Hello,\nworld!'
>>> print(literal_eval("'%s'" % s))
Hello,
world!
Edit: OK, that's not what you want. What you want can't be done in general because, as @Sven Marnach explained, strings don't actually contain escape sequences. Those are just notation in string literals; the string itself holds only the characters they stand for.
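To make that point concrete, here is a small sketch (written for Python 3; the variable names `escaped` and `raw` are just for illustration) comparing a normal literal with a raw literal:

```python
# Escape sequences exist only in source-code literals; the resulting
# string object holds the characters they denote.
escaped = '\x80'    # one character, U+0080
raw = r'\x80'       # four characters: backslash, 'x', '8', '0'

print(len(escaped))   # 1
print(len(raw))       # 4
print(repr(escaped))  # repr() re-creates the escape notation for display
```

So the `\x80` you see when printing the list is how `repr()` displays a single non-printable character, not something stored in the string that could be stripped out.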
You can filter out all strings containing non-ASCII characters from your list with:
    def is_ascii(s):
        # Python 2: s is a byte string, and decode() raises
        # UnicodeDecodeError on any byte >= 0x80.
        try:
            s.decode('ascii')
            return True
        except UnicodeDecodeError:
            return False

    [s for s in ['william', 'short', '\x80', 'twitter', '\xaa',
                 '\xe2', 'video', 'guy', 'ray']
     if is_ascii(s)]
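The snippet above relies on Python 2 byte strings. On Python 3, `str` has no `decode()` method, so a rough equivalent (a sketch only; the names `words` and `ascii_only` are made up for this example) would try to encode instead:

```python
def is_ascii(s):
    """Return True if every character of s is in the ASCII range."""
    try:
        s.encode('ascii')
        return True
    except UnicodeEncodeError:
        return False

words = ['william', 'short', '\x80', 'twitter', '\xaa',
         '\xe2', 'video', 'guy', 'ray']
ascii_only = [s for s in words if is_ascii(s)]
print(ascii_only)
# ['william', 'short', 'twitter', 'video', 'guy', 'ray']
```

On Python 3.7 and later, the built-in `str.isascii()` method should give the same result without the helper function.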