What happens when you call `if key in dict`

旧时难觅i 2020-12-11 06:05

I have a class (let's call it myClass) that implements both __hash__ and __eq__. I also have a dict that maps myCl

3 Answers
  • 2020-12-11 06:25

    __hash__ is always called. __eq__ is called if an equal object is already in the dictionary, or if another key with the same hash is in the dictionary. The hash value narrows down the set of candidate keys: keys are grouped into "buckets" by hash value, but for a lookup Python still has to compare each candidate key in the bucket against the lookup key for equality. See http://wiki.python.org/moin/DictionaryKeys . Look at these examples:

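    >>> class Foo(object):
    ...     def __init__(self, x):
    ...         self.x = x
    ...
    ...     def __hash__(self):
    ...         print("Hash")   # trace every hash computation
    ...         return hash(self.x)
    ...
    ...     def __eq__(self, other):
    ...         print("Eq")     # trace every equality check
    ...         return self.x == other.x
    ...
    >>> # Building the dict hashes each key once; the hashes differ, so no Eq.
    >>> d = {Foo(1): 2, Foo(2): 3, Foo(3): 4}
    Hash
    Hash
    Hash
    >>> Foo(1) in d
    Hash
    Eq
    True
    >>> Foo(2) in d
    Hash
    Eq
    True
    >>> Foo(3) in d
    Hash
    Eq
    True
    >>> Foo(4) in d
    Hash
    False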
    

    In that example, __hash__ is called on every lookup. __eq__ is called once per lookup when an equal key is in the dict: the keys all have distinct hash values, so a single equality check is enough to confirm that the key with that hash value is indeed the one being queried. __eq__ is not called in the last case, because no key in the dict has the same hash value as Foo(4), so Python can answer False without comparing anything.
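    One nuance to the example above, offered as a hedged aside: CPython compares by identity before falling back to __eq__, so looking up the very same object that is stored as a key should not call __eq__ at all. A minimal sketch in Python 3 syntax; the Tracked class and the calls list are hypothetical names used only for illustration:

```python
calls = []  # hypothetical log of which special methods get invoked


class Tracked:
    def __init__(self, x):
        self.x = x

    def __hash__(self):
        calls.append("hash")
        return hash(self.x)

    def __eq__(self, other):
        calls.append("eq")
        return self.x == other.x


key = Tracked(1)
d = {key: "value"}

# Look up the exact object that is stored as the key.
calls.clear()
same_object_found = key in d
same_object_calls = list(calls)

# Look up a different object that merely compares equal.
calls.clear()
equal_object_found = Tracked(1) in d
equal_object_calls = list(calls)

print(same_object_found, same_object_calls)
print(equal_object_found, equal_object_calls)
```

    On CPython the first lookup records only a hash call, while the second records a hash call followed by one equality check.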

    >>> class Foo(object):
    ...     def __init__(self, x):
    ...         self.x = x
    ...
    ...     def __hash__(self):
    ...         print("Hash")
    ...         return 1        # every instance collides on purpose
    ...
    ...     def __eq__(self, other):
    ...         print("Eq")
    ...         return self.x == other.x
    ...
    >>> d = {Foo(1): 2, Foo(2): 3, Foo(3): 4}
    Hash
    Hash
    Eq
    Hash
    Eq
    Eq
    >>> Foo(1) in d
    Hash
    Eq
    True
    >>> Foo(2) in d
    Hash
    Eq
    Eq
    True
    >>> Foo(3) in d
    Hash
    Eq
    Eq
    Eq
    True
    >>> Foo(4) in d
    Hash
    Eq
    Eq
    Eq
    False
    

    In this version, all objects have the same hash value. Here __eq__ is always called, sometimes multiple times, because the hash doesn't distinguish between the keys, so Python has to check equality explicitly against the keys in the dict until it finds an equal one (or finds that none of them equals the one it's looking for). Sometimes it finds a match on the first try (Foo(1) in d above), sometimes it has to check all the keys.
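    To see how this degrades with size, here is a rough sketch (Python 3 syntax) that counts __eq__ calls for a lookup that misses when every key collides. ConstHash and eq_calls_for_miss are hypothetical names for illustration, and the exact counts are an implementation detail of the interpreter:

```python
class ConstHash:
    eq_calls = 0  # class-level counter of __eq__ invocations

    def __init__(self, x):
        self.x = x

    def __hash__(self):
        return 1  # every instance gets the same hash

    def __eq__(self, other):
        ConstHash.eq_calls += 1
        return self.x == other.x


def eq_calls_for_miss(n):
    """Build a dict of n colliding keys, then count __eq__ calls for a miss."""
    d = {ConstHash(i): i for i in range(n)}
    ConstHash.eq_calls = 0          # ignore comparisons made while building
    found = ConstHash(-1) in d      # -1 is never stored, so this must miss
    assert not found
    return ConstHash.eq_calls


for n in (10, 100, 1000):
    print(n, eq_calls_for_miss(n))
```

    The count should grow in step with n, which is the linear scan described above; with a well-distributed hash the same lookup would need no comparisons at all.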

  • 2020-12-11 06:27

    __hash__ determines the bucket the object is put into; __eq__ gets called only when objects land in the same bucket.
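    The bucket idea can be sketched with a toy hash table. This is a deliberately simplified model with separate chaining, not CPython's actual dict layout; ToyDict and its methods are hypothetical names:

```python
class ToyDict:
    """Simplified chained hash table; NOT how CPython lays out a dict."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Step 1: the hash alone decides which bucket to look in.
        return self.buckets[hash(key) % len(self.buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:              # equal key: overwrite the value
                bucket[i] = (existing, value)
                return
        bucket.append((key, value))

    def __contains__(self, key):
        # Step 2: equality is checked only against keys in that one bucket.
        return any(existing == key for existing, _ in self._bucket(key))


toy = ToyDict()
toy["apple"] = 1
toy["pear"] = 2
print("apple" in toy, "plum" in toy)  # True False
```

    Keys in other buckets are never compared, which is why a good hash keeps the number of __eq__ calls small.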

  • 2020-12-11 06:34

    First, hash(myNewMyClassObj) is evaluated, which calls myNewMyClassObj.__hash__(). If no object with the same hash is found in the dictionary, Python concludes that myNewMyClassObj is not in the dictionary. (Note that Python requires that whenever __eq__ reports two objects as equal, their hash values must be equal as well.)

    If some objects with the same hash are found in the dictionary, __eq__ gets called on them one at a time. As soon as __eq__ reports equality for one of them, myNewMyClassObj in dict_ evaluates to True; if none of them is equal, it evaluates to False.
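    The requirement that equal objects have equal hashes matters because of the hash-first lookup described above. A small sketch of what goes wrong when it is violated; BrokenKey and its salt parameter are hypothetical and exist only to force two equal objects to hash differently:

```python
class BrokenKey:
    """Deliberately violates the rule: equal objects, unequal hashes."""

    def __init__(self, x, salt):
        self.x = x
        self.salt = salt

    def __eq__(self, other):
        return self.x == other.x   # equality looks only at x

    def __hash__(self):
        return self.salt           # hash ignores x entirely


a = BrokenKey(1, salt=1)
b = BrokenKey(1, salt=2)
d = {a: "stored"}

print(a == b)   # True: the objects compare equal
print(b in d)   # False: the hashes differ, so __eq__ is never consulted
```

    The lookup gives up at the hash step, so the dict cannot find a key that is equal by __eq__.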

    Thus, you just need to make sure both __eq__ and __hash__ are fast.

    To your follow-up question: yes, dict_ stores only one of a set of equivalent MyClass objects (as defined by __eq__). (As does set.)
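    A quick sketch of that behavior. Key and its label attribute are hypothetical; label is left out of equality on purpose so the two instances can be told apart. Which of the equal objects is retained as the key is, as far as I know, a CPython implementation detail:

```python
class Key:
    def __init__(self, x, label):
        self.x = x
        self.label = label   # not part of equality or hashing

    def __eq__(self, other):
        return self.x == other.x

    def __hash__(self):
        return hash(self.x)


d = {}
d[Key(1, "first")] = "a"
d[Key(1, "second")] = "b"    # equal key: the value is replaced, no new entry

stored_key = next(iter(d))
print(len(d), stored_key.label, d[Key(1, "lookup")])

s = {Key(1, "first"), Key(1, "second")}
print(len(s))
```

    On CPython the dict ends up with a single entry whose key is still the first object and whose value is the last one assigned; the set likewise holds one element.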

    Note that __eq__ is only called on objects that have the same hash and ended up in the same bucket. The number of such objects is usually very small (the dict implementation keeps it small by resizing the table as it fills up), so lookups still take roughly O(1) time.
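    As a rough check of that claim, the sketch below counts __eq__ calls for lookups in a dict with 10000 keys whose hashes are all distinct. Counted is a hypothetical class, and the counts reflect CPython behavior rather than a language guarantee:

```python
class Counted:
    eq_calls = 0  # class-level counter of __eq__ invocations

    def __init__(self, x):
        self.x = x

    def __hash__(self):
        return hash(self.x)   # distinct small ints give distinct hashes

    def __eq__(self, other):
        Counted.eq_calls += 1
        return self.x == other.x


d = {Counted(i): i for i in range(10000)}

Counted.eq_calls = 0
hit = Counted(1234) in d       # an equal key is stored
hit_eq_calls = Counted.eq_calls

Counted.eq_calls = 0
miss = Counted(-5) in d        # no stored key shares this hash
miss_eq_calls = Counted.eq_calls

print(hit, hit_eq_calls)
print(miss, miss_eq_calls)
```

    Even with thousands of keys, the hit needs a single comparison and the miss needs none, because only keys whose hash matches are ever compared.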
