I'm trying to compute the MD5 hash of some strings, but I have noticed that:
For the string: "123456çñ"
Some websites like
http://www.md5.net
www.md5.cz
Let us use Python 2 (in a UTF-8 terminal) to understand this.
>>> '123456çñ'
'123456\xc3\xa7\xc3\xb1'
>>> 'ç'
'\xc3\xa7'
>>> 'ñ'
'\xc3\xb1'
In the above output, we see the UTF-8 encoding of 'ç' and 'ñ': each character is stored as two bytes.
>>> from hashlib import md5
>>> md5('123456çñ').digest().encode('hex')
'66f561bb6b68372213dd9768e55e1002'
So, when we compute the MD5 hash of the UTF-8 encoded data, we get the first result.
>>> u'ç'
u'\xe7'
>>> u'ñ'
u'\xf1'
Here, we see the Unicode code points of 'ç' (U+00E7) and 'ñ' (U+00F1).
>>> md5('123456\xe7\xf1').digest().encode('hex')
'9e6c9a1eeb5e00fbf4a2cd6519e0cfcb'
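So, when we compute the MD5 hash of the bytes whose values equal the Unicode code points of each character in the string, we get the second result. This is exactly the ISO-8859-1 (Latin-1) encoding, because ISO-8859-1 maps the code points U+0000 to U+00FF to the same byte values.
So, the first website is computing the hash of the UTF-8 encoded data, while the second one appears to be hashing the ISO-8859-1 encoded data.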
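The relationship between code points and the two byte encodings can be checked directly. The following is a minimal Python 3 sketch (not the Python 2 session above); in Python 3 a `str` is Unicode text, so bytes only appear after an explicit `.encode()`. The variable names are illustrative only.

```python
# Python 3 sketch: compare code points with ISO-8859-1 and UTF-8 bytes.
text = "çñ"

code_points = [ord(ch) for ch in text]        # Unicode code points
latin1_bytes = list(text.encode("latin-1"))   # ISO-8859-1: one byte per character
utf8_bytes = list(text.encode("utf-8"))       # UTF-8: two bytes per character here

print([hex(n) for n in code_points])
print([hex(n) for n in latin1_bytes])
print([hex(n) for n in utf8_bytes])
```

For these two characters the Latin-1 byte values coincide with the code points, while UTF-8 produces a different, longer byte sequence.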
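To reproduce the comparison without depending on the terminal encoding, here is a small Python 3 sketch that makes the encoding explicit before hashing. The helper name `md5_hex` is hypothetical, introduced only for this illustration; `hashlib.md5` and `str.encode` are standard library calls. If the transcript above is accurate, the two printed digests should match the two values shown there.

```python
import hashlib


def md5_hex(text: str, encoding: str) -> str:
    """Return the MD5 hex digest of `text` after encoding it to bytes."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


text = "123456çñ"

# MD5 operates on bytes, so the same text hashed under two
# different encodings gives two different digests.
utf8_digest = md5_hex(text, "utf-8")
latin1_digest = md5_hex(text, "latin-1")

print("utf-8  :", utf8_digest)
print("latin-1:", latin1_digest)
```

For a pure ASCII string such as "123456" both encodings yield the same bytes, so the digests agree; the difference only shows up once non-ASCII characters are involved.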