Intersecting two dictionaries in Python

后端 未结 8 2044
借酒劲吻你
借酒劲吻你 2020-11-27 18:26

I am working on a search program over an inverted index. The index itself is a dictionary whose keys are terms and whose values are themselves dictionaries of short document

8条回答
  •  -上瘾入骨i
    2020-11-27 18:38

    A little known fact is that you don't need to construct sets to do this:

    In Python 2:

    In [78]: d1 = {'a': 1, 'b': 2}
    
    In [79]: d2 = {'b': 2, 'c': 3}
    
    In [80]: d1.viewkeys() & d2.viewkeys()
    Out[80]: {'b'}
    

    In Python 3 replace viewkeys with keys; the same applies to viewvalues and viewitems.

    From the documentation of viewitems:

    In [113]: d1.viewitems??
    Type:       builtin_function_or_method
    String Form:
    Docstring:  D.viewitems() -> a set-like object providing a view on D's items
    

    For larger dicts this also slightly faster than constructing sets and then intersecting them:

    In [122]: d1 = {i: rand() for i in range(10000)}
    
    In [123]: d2 = {i: rand() for i in range(10000)}
    
    In [124]: timeit d1.viewkeys() & d2.viewkeys()
    1000 loops, best of 3: 714 µs per loop
    
    In [125]: %%timeit
    s1 = set(d1)
    s2 = set(d2)
    res = s1 & s2
    
    1000 loops, best of 3: 805 µs per loop
    
    For smaller `dict`s `set` construction is faster:
    
    In [126]: d1 = {'a': 1, 'b': 2}
    
    In [127]: d2 = {'b': 2, 'c': 3}
    
    In [128]: timeit d1.viewkeys() & d2.viewkeys()
    1000000 loops, best of 3: 591 ns per loop
    
    In [129]: %%timeit
    s1 = set(d1)
    s2 = set(d2)
    res = s1 & s2
    
    1000000 loops, best of 3: 477 ns per loop
    

    We're comparing nanoseconds here, which may or may not matter to you. In any case, you get back a set, so using viewkeys/keys eliminates a bit of clutter.

提交回复
热议问题