I was going through some Python dictionary links and found this.
I can't seem to understand what is happening underneath.
dict1 = {1: '1', 2: '2'}
Conceptually, a Python dict stores its entries as hash-slot pairs, where a slot holds the key-value pairs that share a given hash. Looking up a value by key in a dict therefore works as follows:

1. Compute hash(key).
2. Find the slot under that hash value.
3. Iterate over the slot to find the target key (call it tkey) that satisfies tkey == key, then return the value of that key.

Therefore, in Python, keys that compare equal can still map to different values if their hashes differ, and keys with the same hash can map to different values if they do not compare equal. The hash value is computed by the __hash__ method, and whether two keys are the same is decided by the __eq__ method (or __cmp__ in Python 2).
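The lookup steps above can be sketched as a toy model. This is only an illustration: the ToyDict class, its fixed number of slots, and the list-of-lists layout are assumptions made for the example, and CPython's real dict uses a different internal layout.

```python
class ToyDict:
    """Toy model of hash-slot lookup; not CPython's actual implementation."""

    def __init__(self, n_slots=8):
        # Each slot is a list of [key, value] entries whose hashes land there.
        self._slots = [[] for _ in range(n_slots)]

    def _slot_for(self, key):
        # Steps 1 and 2: compute the hash, then pick the slot for it.
        return self._slots[hash(key) % len(self._slots)]

    def __setitem__(self, key, value):
        slot = self._slot_for(key)
        for entry in slot:
            if entry[0] == key:      # an equal key already exists: overwrite
                entry[1] = value
                return
        slot.append([key, value])    # no equal key in the slot: new entry

    def __getitem__(self, key):
        # Step 3: scan the slot for a key that compares equal.
        for tkey, value in self._slot_for(key):
            if tkey == key:
                return value
        raise KeyError(key)


toy = ToyDict()
toy[1] = "one"
toy[True] = "two"   # same hash as 1 and True == 1, so this overwrites "one"
print(toy[1])       # two
print(toy[True])    # two
```

In this sketch, two keys end up as separate entries only when they land in different slots or fail the == test, which mirrors the rule described above.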
For example,
class A:
    def __hash__(self):
        return 1

    def __eq__(self, other):
        return False
Now all instances of A have the same hash value, 1, but every instance compares unequal to every other one (and even to itself):
a1 = A()
a2 = A()
print(hash(a1) == hash(a2)) # True
print(a1 == a2) # False
print(a1 == a1) # False
Let's see what happens when they serve as keys in a dict:
b = {
    a1: 1,
    a2: 2,
}
print(b)
# {<__main__.A object at 0x000002DDCB505DD8>: 1,
# <__main__.A object at 0x000002DDCB505D30>: 2}
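As a side note that goes slightly beyond the description above: CPython's dict checks object identity before falling back to ==, so an instance of A can still be looked up with the very same object even though its __eq__ always returns False. A minimal sketch, which redefines A so that it runs on its own:

```python
class A:
    def __hash__(self):
        return 1

    def __eq__(self, other):
        return False


a1 = A()
a2 = A()
b = {a1: 1, a2: 2}

print(len(b))   # 2: same hash, but the keys never compare equal
print(b[a1])    # 1: found through the identity check, not through __eq__
print(b[a2])    # 2

try:
    b[A()]      # a fresh instance is neither identical nor equal to any key
except KeyError:
    print("KeyError")
```

So the two entries stay separate, yet each one remains reachable through the exact object that was used to insert it.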
True and 1 cannot exist simultaneously in one dict. In this question (and in most cases in Python), keys that have the same hash are also equal keys.
print(hash(True) == hash(1)) # True
print(True == 1) # True
The result (or rather, the reason for this equality mechanism) is that each hash slot normally holds only one key-value pair, because keys with the same hash are equal. This makes lookup very fast, since there is no need to iterate over the slot. Still, you can change this equality in your own code to get multiple same-hash keys in a dict.
class exint(int):
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return False

    def __hash__(self):
        return int.__hash__(self.val)

a = exint(1)
print(a)  # 1
b = {
    a: 1,
    True: 2,
}
print(b)  # {1: 1, True: 2}
The problem is that True is an instance of bool, which is a subclass of int, and it has the value 1. Thus, the hash function sees True as simply another 1, and ... well, the two collide when used as dict keys, as you see. Yes, there are firm rules that describe how Python interprets these, but you probably don't need anything past False == 0 and True == 1 at this level.
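The relationship between True and 1 can be checked directly. A quick sketch using only built-ins:

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
print(issubclass(bool, int))    # True
print(isinstance(True, int))    # True
print(True == 1, False == 0)    # True True
print(hash(True), hash(1))      # 1 1
print(True + True)              # 2
```

Equal value plus equal hash is exactly the combination that makes a dict treat two objects as one key.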
The label you see (True vs 1, for example) is set at the first reference. For instance:
>>> d = {True:11, 0:10}
>>> d
{0: 10, True: 11}
>>> d[1] = 144
>>> d
{0: 10, True: 144}
>>> d[False] = 100
>>> d
{0: 100, True: 144}
Note how this works: each dictionary entry displays the first label it sees for each given key (0/False and 1/True). As with any assignment, the value displayed is the last one assigned.
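The same behaviour can be checked without reading the printed dict. The order in which a dict displays its entries depends on the Python version (insertion order is preserved from 3.7 on), so this sketch inspects the keys and values directly instead:

```python
d = {True: 11, 0: 10}
d[1] = 144       # 1 == True with the same hash: updates the existing entry
d[False] = 100   # False == 0 with the same hash: same story

print(len(d))             # 2
print(d[True], d[1])      # 144 144
print(d[0], d[False])     # 100 100

# The stored key objects are still the ones inserted first.
print(sorted(type(k).__name__ for k in d))   # ['bool', 'int']
```

Assigning through an equal key replaces the value but leaves the original key object in place, which is why the label never changes.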
A Python dict is a hash map - it indexes its keys by a hash function for quick lookup in memory. Since hash(1) == hash(True) evaluates to True, and 1 == True does as well, Python sees both as the same key. Thus, you cannot have both 1 and True in any sort of hash store in Python (without implementing your own hash or equality methods, that is).
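The same holds for sets, which are hash stores too. The second half of this sketch shows one possible workaround, keying on a (type, value) tuple; that workaround and the name typed are assumptions added for illustration, not something the answers above propose.

```python
# 1, True and 1.0 are all equal and share one hash, so a set keeps only one.
s = {1, True, 1.0}
print(len(s))   # 1
print(s)        # {1}

# Hypothetical workaround: include the type in the key so the tuples differ.
typed = {
    (int, 1): "the int one",
    (bool, True): "the bool True",
}
print(len(typed))           # 2
print(typed[(bool, True)])  # the bool True
```

The tuples compare unequal because int and bool are different type objects, so the two entries no longer collapse into one.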