Most efficient way for a lookup/search in a huge list (python)

深忆病人 2020-12-01 01:51

I just parsed a big file and created a list containing 42,000 strings/words. I want to query against this list to check whether a given word/string belongs to it. So my question is: what is the most efficient way to perform such a lookup?

3 Answers
  •  失恋的感觉
    2020-12-01 02:50

    Using this program, it looks like dicts are the fastest, sets a close second, and a list with bi_contains third:

    from bisect import bisect_left
    from datetime import datetime
    
    def bi_contains(lst, item):
        """ Binary-search membership test. lst must be sorted. """
        pos = bisect_left(lst, item)
        return pos < len(lst) and lst[pos] == item
    
    def ReadWordList():
        """ Loop through each line in english.txt and add it to the list in uppercase.
    
        Returns:
        A list with all the words in english.txt.
    
        """
        l_words = []
        with open(r'c:\english.txt', 'r') as f_in:
            for line in f_in:
                line = line.strip().upper()
                l_words.append(line)
    
        return l_words
    
    l_words = ReadWordList()
    # Uncomment exactly one conversion to pick the data structure to benchmark:
    l_words = {key: None for key in l_words}
    #l_words = set(l_words)
    #l_words = tuple(l_words)
    #l_words = sorted(l_words)  # required for the bi_contains variant below
    
    t1 = datetime.now()
    
    for i in range(10000):
        w = 'ZEBRA' in l_words
        #w = bi_contains(l_words, 'ZEBRA')  # use only with the sorted list
    
    t2 = datetime.now()
    print('After: ' + str(t2 - t1))
    
    # list = 41.025293 seconds
    # dict = 0.001488 seconds
    # set = 0.001499 seconds
    # tuple = 38.975805 seconds
    # list with bi_contains = 0.014000 seconds
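
    The gap in those timings comes down to average-case complexity: `in` on a list or tuple is a linear scan (O(n) string comparisons), while dicts and sets hash the key for an O(1) lookup. A minimal, self-contained sketch that reproduces the effect with the standard `timeit` module, using 42,000 synthetic words in place of english.txt (the iteration count is scaled down so the list case finishes quickly):

    ```python
    import timeit

    # Stand-in for the parsed file: 42,000 synthetic uppercase "words".
    words = ['WORD%05d' % i for i in range(42000)]
    as_list = words
    as_set = set(words)
    as_dict = dict.fromkeys(words)

    # Worst case for the linear scan: the last element of the list.
    target = 'WORD41999'

    for name, container in [('list', as_list), ('set', as_set), ('dict', as_dict)]:
        t = timeit.timeit(lambda: target in container, number=1000)
        print('%s: %.6f seconds' % (name, t))
    ```

    The list timing grows with the position of the word (and is worst when the word is absent), while the set and dict timings stay flat regardless of where the word sits.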
    
