Intersecting two dictionaries in Python

借酒劲吻你 2020-11-27 18:26

I am working on a search program over an inverted index. The index itself is a dictionary whose keys are terms and whose values are themselves dictionaries of short document

8 answers
  •  生来不讨喜
    2020-11-27 18:50

    def two_keys(term_a, term_b, index):
        # Document ids that appear under both terms.
        doc_ids = set(index[term_a]) & set(index[term_b])
        doc_store = index[term_a]  # index[term_b] would work as well
        return {doc_id: doc_store[doc_id] for doc_id in doc_ids}
    
    def n_keys(terms, index):
        # Document ids that appear under every term in `terms`.
        doc_ids = set.intersection(*[set(index[term]) for term in terms])
        doc_store = index[terms[0]]  # any of the terms would work
        return {doc_id: doc_store[doc_id] for doc_id in doc_ids}
    
    In [0]: index = {'a': {1: 'a b'}, 
                     'b': {1: 'a b'}}
    
    In [1]: two_keys('a','b', index)
    Out[1]: {1: 'a b'}
    
    In [2]: n_keys(['a','b'], index)
    Out[2]: {1: 'a b'}
    

    I would recommend changing your index from

    index = {term: {doc_id: doc}}
    

    to two indexes: one that maps each term to its document ids, and a separate one that holds the documents themselves

    term_index = {term: set([doc_id])}
    doc_store = {doc_id: doc}
    

    That way you don't store multiple copies of the same data.
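
    As a rough sketch of how a search could look with the two-index layout: the helper names `build_indexes` and `search_all`, the sample `docs`, and the whitespace tokenization are assumptions made for illustration, not part of the original question.

```python
def build_indexes(docs):
    """Build a term -> {doc_id} index and a doc_id -> text store.

    Assumes `docs` maps doc_id -> text and that terms are
    whitespace-separated words (a simplification for this sketch).
    """
    term_index = {}
    doc_store = {}
    for doc_id, text in docs.items():
        doc_store[doc_id] = text
        for term in text.split():
            term_index.setdefault(term, set()).add(doc_id)
    return term_index, doc_store


def search_all(terms, term_index, doc_store):
    """Return {doc_id: doc} for documents that contain every term."""
    if not terms:
        return {}
    # A term missing from the index matches no documents.
    id_sets = [term_index.get(term, set()) for term in terms]
    doc_ids = set.intersection(*id_sets)
    return {doc_id: doc_store[doc_id] for doc_id in doc_ids}


# Hypothetical sample data.
docs = {1: 'a b', 2: 'b c', 3: 'a b c'}
term_index, doc_store = build_indexes(docs)
result = search_all(['a', 'b'], term_index, doc_store)
```

    Each document is stored once in `doc_store`, and the intersection only ever touches sets of ids, so the lookup of the text happens once per matching document.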
