Does python's hash function remain identical across different versions?

回眸只為那壹抹淺笑 提交于 2019-12-08 20:01:55

问题


I'm currently using hash on tuples of integers and strings (and nested tuples of integers and strings etc.) in order to compute the uniqueness of some objects. Barring that there might be a hash collisions, I wonder - is the hash function on those data types guaranteed to return the same result for different versions of Python?


回答1:


No. Apart from long-standing differences between 32- and 64-bit versions of Python, the hashing algorithm was changed in Python 3.3 to resolve a security issue:

By default, the hash() values of str, bytes and datetime objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python.

This is intended to provide protection against a denial-of-service caused by carefully-chosen inputs that exploit the worst case performance of a dict insertion, O(n^2) complexity. See http://www.ocert.org/advisories/ocert-2011-003.html for details.

Changing hash values affects the iteration order of dicts, sets and other mappings. Python has never made guarantees about this ordering (and it typically varies between 32-bit and 64-bit builds).

As a result, from 3.3 onwards hash() is not even guaranteed to return the same result across different invocations of the same Python version.




回答2:


I'm not sure what you are looking for, but you can always use hashlib if you're looking for consistent hashing.

>>> import hashlib
>>> t = ("values", "other")
>>> hashlib.sha256(str(t)).hexdigest()
'bc3ed71325acf1386b40aa762b661bb63bb72e6df9457b838a2ea93c95cc8f0c'

OR:

>>> h = hashlib.sha256()
>>> for item in t:
...     h.update(item)
...
>>> h.hexdigest()
'5e98df135627bc8d98250ca7e638aeb2ccf7981ce50ee16ce00d4f23efada068'



回答3:


No. eg.

32 bit

Python 2.7.3 (default, Aug  1 2012, 05:16:07) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hash("foobar")
-1969371895

64 bit

Python 2.7.3 (default, Aug  1 2012, 05:14:39) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hash("foobar")
3433925302934160649


来源:https://stackoverflow.com/questions/16452252/does-pythons-hash-function-remain-identical-across-different-versions

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!