Convert hash.digest() to unicode

问题

import hashlib
string1 = u'test'
hashstring = hashlib.md5()
hashstring.update(string1)
string2 = hashstring.digest()

unicode(string2)

UnicodeDecodeError: 'ascii' codec can't decode byte 0x8f in position 1: ordinal
not in range(128)

The string HAS to be unicode for it to be any use to me, can this be done? Using python 2.7 if that helps...

回答1:

The result of .digest() is a bytestring¹, so converting it to Unicode is pointless. Use .hexdigest() if you want a readable representation.

¹ Some bytestrings can be converted to Unicode, but the bytestrings returned by .digest() do not contain textual data. They can contain any byte including the null byte: they're usually not printable without using escape sequences.

回答2:

Ignacio just gave the perfect answer. Just a complement: when you convert some string from an encoding which has chars not found in ASCII to unicode, you have to pass the encoding as a parameter:

>>> unicode("órgão")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode("órgão", "UTF-8")
u'\xf3rg\xe3o'

If you cannot say what is the original encoding (UTF-8 in my example) you really cannot convert to Unicode. It is a signal that something is not pretty correct in your intentions.

Last but not least, encodings are pretty confusing stuff. This comprehensive text about them can make them clear.

来源：https://stackoverflow.com/questions/6257647/convert-hash-digest-to-unicode

标签

python

unicode

unicode-string