What is a good way to remove all characters outside the ASCII range (ordinal 128 and above)
from a string in Python?
I'm using hashlib.sha256 in Python 2.7.
This is an example of where the changes in Python 3 make an improvement, or at least generate a clearer error message.
Python 2
>>> import hashlib
>>> funky_string=u"You owe me £100"
>>> hashlib.sha256(funky_string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 11: ordinal not in range(128)
>>> hashlib.sha256(funky_string.encode("utf-8")).hexdigest()
'81ebd729153b49aea50f4f510972441b350a802fea19d67da4792b025ab6e68e'
>>>
Python 3
>>> import hashlib
>>> funky_string="You owe me £100"
>>> hashlib.sha256(funky_string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing
>>> hashlib.sha256(funky_string.encode("utf-8")).hexdigest()
'81ebd729153b49aea50f4f510972441b350a802fea19d67da4792b025ab6e68e'
>>>
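The real problem is that sha256 takes a sequence of bytes, which Python 2 doesn't have a clear concept of. Given a unicode string, Python 2 silently tries to encode it as ASCII first, and that fails on the £ character.
Encoding the string explicitly with .encode("utf-8") before hashing is what I'd suggest, rather than stripping characters.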
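As a rough sketch of both options (the helper names sha256_hex and strip_non_ascii are made up for illustration, not part of hashlib): the first encodes text to UTF-8 before hashing, and the second shows what literally dropping everything at ordinal 128 and above would look like, if that is really what you want. The £ sign is written as \xa3 so the snippet does not depend on the source file's encoding.

```python
import hashlib


def sha256_hex(text, encoding="utf-8"):
    """Return the SHA-256 hex digest of text (hypothetical helper)."""
    # hashlib works on bytes, so encode text explicitly instead of
    # relying on Python 2's implicit ASCII conversion.
    if not isinstance(text, bytes):
        text = text.encode(encoding)
    return hashlib.sha256(text).hexdigest()


def strip_non_ascii(text):
    """Drop every character whose ordinal is 128 or above (hypothetical helper)."""
    # errors="ignore" silently discards anything ASCII cannot represent.
    return text.encode("ascii", "ignore").decode("ascii")


funky_string = u"You owe me \xa3100"

print(sha256_hex(funky_string))       # hash of the UTF-8 bytes, £ included
print(strip_non_ascii(funky_string))  # You owe me 100
```

Note that stripping changes the data: "You owe me £100" and "You owe me 100" would then produce the same hash, which is usually not what you want.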