How is set() implemented?

后端 未结 6 1740
旧时难觅i
旧时难觅i 2020-11-22 15:07

I\'ve seen people say that set objects in python have O(1) membership-checking. How are they implemented internally to allow this? What sort of data structure d

6条回答
  •  日久生厌
    2020-11-22 15:52

    We all have easy access to the source, where the comment preceding set_lookkey() says:

    /* set object implementation
     Written and maintained by Raymond D. Hettinger 
     Derived from Lib/sets.py and Objects/dictobject.c.
     The basic lookup function used by all operations.
     This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
     The initial probe index is computed as hash mod the table size.
     Subsequent probe indices are computed as explained in Objects/dictobject.c.
     To improve cache locality, each probe inspects a series of consecutive
     nearby entries before moving on to probes elsewhere in memory.  This leaves
     us with a hybrid of linear probing and open addressing.  The linear probing
     reduces the cost of hash collisions because consecutive memory accesses
     tend to be much cheaper than scattered probes.  After LINEAR_PROBES steps,
     we then use open addressing with the upper bits from the hash value.  This
     helps break-up long chains of collisions.
     All arithmetic on hash should ignore overflow.
     Unlike the dictionary implementation, the lookkey function can return
     NULL if the rich comparison returns an error.
    */
    
    
    ...
    #ifndef LINEAR_PROBES
    #define LINEAR_PROBES 9
    #endif
    
    /* This must be >= 1 */
    #define PERTURB_SHIFT 5
    
    static setentry *
    set_lookkey(PySetObject *so, PyObject *key, Py_hash_t hash)  
    {
    ...
    

提交回复
热议问题