MD5 hash with different results

Submitted by 耗尽温柔 on 2019-12-28 05:58:56

Question


I'm trying to hash some strings with MD5, but I have noticed the following.

For the string: "123456çñ"

Some websites, such as

http://www.md5.net

www.md5.cz

md5generator.net

return: "66f561bb6b68372213dd9768e55e1002"

And others like:

http://www.adamek.biz/md5-generator.php

7thspace.com/webmaster_tools/online_md5_encoder.html

md5.rednoize.com/

return: "9e6c9a1eeb5e00fbf4a2cd6519e0cfcb"

I need to hash the strings with standard MD5 because my results have to match those of other systems. Which hash is the correct one?

Thanks in advance


Answer 1:


The problem, I guess, is different text encodings. The string you show contains non-ASCII characters (ç and ñ), so its bytes depend on the encoding used: a single-byte encoding such as ISO-8859-1, or UTF-8, or UTF-16. Each choice leads to a different byte representation of the string, and that produces different hashes.

Remember, MD5 hashes bytes, not characters - it's up to you how to encode those characters as bytes before feeding bytes to MD5. If you want to interoperate with other systems you have to use the same encoding as those systems.
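To make the point concrete, here is a minimal Python 3 sketch that hashes the string from the question under several encodings. The helper name `md5_hex` and the list of encodings are illustrative choices, not something the other systems are known to use.

```python
import hashlib


def md5_hex(text: str, encoding: str) -> str:
    """Encode text to bytes with the given encoding, then return the MD5 hex digest."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


# "123456çñ", written with escapes so the encoding of this source file cannot matter.
TEXT = "123456\u00e7\u00f1"

# Same characters, three different byte sequences, three different digests.
for encoding in ("utf-8", "iso-8859-1", "utf-16-le"):
    raw_hex = TEXT.encode(encoding).hex()
    print(f"{encoding:<11} {raw_hex:<34} {md5_hex(TEXT, encoding)}")
```

Going by the values quoted in the question, the utf-8 line should reproduce the hash from the first group of websites and the iso-8859-1 line the hash from the second group.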




Answer 2:


Let us use Python 2, in a terminal that uses UTF-8, to understand this.

>>> '123456çñ'
'123456\xc3\xa7\xc3\xb1'
>>> 'ç'
'\xc3\xa7'
>>> 'ñ'
'\xc3\xb1'

In the above output, we see the UTF-8 encoding of 'ç' and 'ñ'.

>>> from hashlib import md5
>>> md5('123456çñ').digest().encode('hex')
'66f561bb6b68372213dd9768e55e1002'

So, when we compute the MD5 hash of the UTF-8 encoded data, we get the first result.

>>> u'ç'
u'\xe7'
>>> u'ñ'
u'\xf1'

Here, we see the Unicode code points of 'ç' and 'ñ'.

>>> md5('123456\xe7\xf1').digest().encode('hex')
'9e6c9a1eeb5e00fbf4a2cd6519e0cfcb'

So, when we compute the MD5 hash of the data in which each character is stored as a single byte equal to its Unicode code point (which is what ISO-8859-1 encoding produces), we get the second result.

So, the first group of websites computes the hash of the UTF-8 encoded data, while the second group does not.
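The session above relies on Python 2 byte strings. As a rough Python 3 counterpart, assuming nothing beyond the standard library, the sketch below shows the two byte sequences directly and checks that the ISO-8859-1 bytes coincide with the Unicode code points.

```python
text = "123456\u00e7\u00f1"  # "123456çñ"

utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

print(utf8_bytes)    # b'123456\xc3\xa7\xc3\xb1'
print(latin1_bytes)  # b'123456\xe7\xf1'

# In ISO-8859-1 every byte value equals the character's Unicode code point.
code_points = [ord(ch) for ch in text]
print(code_points == list(latin1_bytes))   # True

# UTF-8 needs two bytes each for 'ç' and 'ñ', so it is two bytes longer.
print(len(utf8_bytes), len(latin1_bytes))  # 10 8
```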




Answer 3:


If I try this in PHP (with the source file saved as UTF-8):

echo "123456çñ<br />";
echo "utf-8 : ".md5("123456çñ")."<br />";
echo "ISO-8859-1 : ".md5(iconv("UTF-8", "ISO-8859-1","123456çñ"))."<br />";

It gives the result:

123456çñ
utf-8 : 66f561bb6b68372213dd9768e55e1002
ISO-8859-1 : 9e6c9a1eeb5e00fbf4a2cd6519e0cfcb

The first group of websites encodes the string as UTF-8 and the second group as ISO-8859-1.




Answer 4:


I would guess that some of these sites are not handling non-ASCII characters correctly. If you are using a standard MD5 library then you should be OK, as long as you and the system you are connecting to agree on which character encoding to use.
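If the other system's encoding is not documented, one way to find out is to take a string whose hash that system has already produced and try a few candidate encodings. The following is only a sketch: `find_matching_encoding` and the candidate list are hypothetical, and the reference hash is computed locally here as a stand-in for a value obtained from the real system.

```python
import hashlib
from typing import Iterable, Optional


def find_matching_encoding(text: str, expected_md5: str,
                           candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate encoding whose MD5 of text equals expected_md5."""
    for encoding in candidates:
        try:
            data = text.encode(encoding)
        except UnicodeEncodeError:
            continue  # this encoding cannot represent the text at all
        if hashlib.md5(data).hexdigest() == expected_md5.lower():
            return encoding
    return None


CANDIDATES = ("ascii", "utf-8", "iso-8859-1", "utf-16-le")
sample = "123456\u00e7\u00f1"  # "123456çñ"

# Stand-in for a hash received from the other system (assumption for this demo).
reference = hashlib.md5(sample.encode("iso-8859-1")).hexdigest()

print(find_matching_encoding(sample, reference, CANDIDATES))  # iso-8859-1
```

A test string with non-ASCII characters is essential here, because pure ASCII text produces the same bytes in most of these encodings and would not tell them apart.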

By the way, MD5 is no longer recommended. If this is for cryptographic purposes, you should really be moving to SHA-2.



Source: https://stackoverflow.com/questions/6839969/md5-hash-with-different-results
