Question
import hashlib
string1 = u'test'
hashstring = hashlib.md5()
hashstring.update(string1)
string2 = hashstring.digest()
unicode(string2)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8f in position 1: ordinal not in range(128)
The string HAS to be Unicode for it to be of any use to me. Can this be done? I'm using Python 2.7, if that helps...
Answer 1:
The result of .digest() is a bytestring¹, so converting it to Unicode is pointless. Use .hexdigest() if you want a readable representation.
¹ Some bytestrings can be converted to Unicode, but the bytestrings returned by .digest() do not contain textual data. They can contain any byte, including the null byte, so they're usually not printable without using escape sequences.
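As a rough sketch of the .hexdigest() suggestion: the snippet below hashes the same u'test' value and ends up with a text (Unicode) string of hex digits. Encoding the input as UTF-8 and the isinstance check (which lets the same code run on Python 2 and Python 3) are my own assumptions, not something from the question.

```python
import hashlib

text = u'test'

# Hash functions work on bytes, so encode the text explicitly first.
# UTF-8 is an assumption here; use whatever encoding your data needs.
hasher = hashlib.md5()
hasher.update(text.encode('utf-8'))

# hexdigest() only ever contains the characters 0-9 and a-f.
hex_digest = hasher.hexdigest()

# On Python 2 hexdigest() returns a byte string, so decode it to unicode.
# On Python 3 it is already text and this branch is skipped.
if isinstance(hex_digest, bytes):
    hex_digest = hex_digest.decode('ascii')

print(hex_digest)
```

Because the hex form is plain ASCII, the decode step cannot fail the way unicode(string2) does on the raw digest.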
Answer 2:
Ignacio just gave the perfect answer. Just a complement: when you convert a string to Unicode from an encoding that has characters not found in ASCII, you have to pass the encoding as a parameter:
>>> unicode("órgão")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode("órgão", "UTF-8")
u'\xf3rg\xe3o'
If you cannot say what the original encoding is (UTF-8 in my example), you really cannot convert to Unicode. That is a sign that something is not quite right in what you are trying to do.
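The same idea can be sketched with an explicit .decode() call, which behaves the same on Python 2 and Python 3. The variable names and the byte literal (the UTF-8 bytes of the word used in the session above) are only for illustration.

```python
# UTF-8 bytes of the example word: two bytes each for the accented letters.
raw = b'\xc3\xb3rg\xc3\xa3o'

# Naming the encoding makes the conversion work; on Python 2 this is
# equivalent to unicode(raw, 'utf-8').
text = raw.decode('utf-8')

# Decoding as ASCII fails, just like the bare unicode() call does.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(repr(text))
print(ascii_ok)
```

If the bytes were produced with some other encoding, the decode may still succeed but give the wrong characters, so the encoding has to be known rather than guessed.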
Last but not least, encodings are pretty confusing stuff. This comprehensive text about them can make them clear.
Source: https://stackoverflow.com/questions/6257647/convert-hash-digest-to-unicode