Convert hash.digest() to unicode

只愿长相守 submitted on 2019-12-18 14:56:30

Question


import hashlib
string1 = u'test'
hashstring = hashlib.md5()
hashstring.update(string1)
string2 = hashstring.digest()

unicode(string2)

UnicodeDecodeError: 'ascii' codec can't decode byte 0x8f in position 1: ordinal not in range(128)

The string HAS to be unicode for it to be of any use to me. Can this be done? I'm using Python 2.7, if that helps...


Answer 1:


The result of .digest() is a bytestring¹, so converting it to Unicode is pointless. Use .hexdigest() if you want a readable representation.

¹ Some bytestrings can be converted to Unicode, but the bytestrings returned by .digest() do not contain textual data. They can contain any byte including the null byte: they're usually not printable without using escape sequences.
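
As a minimal sketch of the suggestion above, the raw digest can be turned into hex digits and only then decoded to text. The variable names are illustrative, and encoding the input as UTF-8 is an assumption (the question passed the unicode string to update() directly, which only works in Python 2 and only for ASCII-only text). binascii.hexlify is used here instead of .hexdigest() so that the same lines give a unicode result on both Python 2.7 and Python 3.

```python
import binascii
import hashlib

text = u'test'

# Hash functions operate on bytes, so encode the text explicitly first.
# UTF-8 is an assumption here; use whatever encoding your data requires.
hasher = hashlib.md5()
hasher.update(text.encode('utf-8'))

raw_digest = hasher.digest()              # 16 raw bytes, not textual data
hex_bytes = binascii.hexlify(raw_digest)  # the same bytes as ASCII hex digits
hex_text = hex_bytes.decode('ascii')      # safe: hex digits are pure ASCII

print(hex_text)  # 098f6bcd4621d373cade4e832627b4f6
```

The second byte of the digest is 0x8f, which is the byte the error message in the question complains about.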




Answer 2:


Ignacio just gave the perfect answer. One addition: when you convert a string from an encoding that has characters not found in ASCII to unicode, you have to pass the encoding as a parameter:

>>> unicode("órgão")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode("órgão", "UTF-8")
u'\xf3rg\xe3o'

If you cannot say what the original encoding is (UTF-8 in my example), you really cannot convert to Unicode. It is a sign that something is not quite right with what you are trying to do.
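
The same point can be sketched without relying on the terminal's encoding by writing the bytes out explicitly. This is only an illustration: the byte literal below is the UTF-8 encoding of "órgão", and the variable names are made up for the example. It runs on Python 2.7 and Python 3.

```python
# The UTF-8 encoded bytes of "órgão" (7 bytes for 5 characters).
raw = b'\xc3\xb3rg\xc3\xa3o'

# Decoding with the correct encoding gives the expected unicode text.
decoded = raw.decode('utf-8')

# Decoding as ASCII fails, because 0xc3 is outside the 0-127 range.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(repr(decoded))
print(ascii_failed)  # True
```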

Last but not least, encodings are pretty confusing stuff. This comprehensive text about them can make them clear.



Source: https://stackoverflow.com/questions/6257647/convert-hash-digest-to-unicode
