Search a list of strings for any sub-string from another list

挽巷 2020-12-29 06:05

Given these 3 lists of data and a list of keywords:

good_data1 = ['hello, world', 'hey, world']
good_data2 = ['hey, man', 'whats up']
bad_data = ['h


        
5 Answers
  •  庸人自扰
    2020-12-29 06:58

    In your example, with so few items, it doesn't really matter. But if you have a list of several thousand items, this might help.

    Since you don't care which element in the list contains the keyword, you can scan the whole list once (as one string) instead of one item at a time. For that you need a join character that you know won't occur in any keyword, in order to avoid false positives where a match spans two adjacent items. I use the newline in this example.

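    def check_data(data):
        # Join once; "\n" must not appear in any keyword.
        s = "\n".join(data)
        # `keywords` is the keyword list from the question.
        for k in keywords:
            if k in s:
                return True

        return False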
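
To illustrate why the join character matters, here is a small sketch. The sample items and keyword are made up for the example (the question's own lists are cut off above), and `check_data_joined` is a hypothetical variant that takes the keywords and separator as parameters instead of relying on a global.

```python
def check_data_joined(data, keywords, sep="\n"):
    """Return True if any keyword occurs in any item of data.

    Assumes no keyword contains `sep`; otherwise a match could span two items.
    """
    joined = sep.join(data)
    return any(k in joined for k in keywords)


# Hypothetical sample data, not the lists from the question.
items = ["hello", "world"]
keywords = ["low"]

# An empty separator produces "helloworld", which contains "low"
# even though neither item does: a false positive.
print(check_data_joined(items, keywords, sep=""))    # True
# With a newline separator the items stay apart, so nothing matches.
print(check_data_joined(items, keywords, sep="\n"))  # False
```

If the keywords might themselves contain newlines, any other character that is guaranteed absent from them would serve the same purpose.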
    

    In my completely unscientific test, my version checked a list of 5000 items 100000 times in about 30 seconds. I stopped your version after 3 minutes -- got tired of waiting to post =)
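
For anyone who wants to repeat a comparison like this, the following is a rough sketch using `timeit`. The asker's original code is not shown in full here, so `check_per_item` is only an assumed baseline that tests every keyword against every item; the data and keywords are synthetic, and the numbers will vary by machine.

```python
import timeit


def check_per_item(data, keywords):
    """Assumed baseline: test every keyword against every item separately."""
    return any(k in item for item in data for k in keywords)


def check_joined(data, keywords, sep="\n"):
    """Join once, then scan the single string for each keyword."""
    joined = sep.join(data)
    return any(k in joined for k in keywords)


# Synthetic worst case: 5000 items, none of which contains a keyword.
data = [f"item number {i}" for i in range(5000)]
keywords = ["xyz", "qqq", "zzz"]

for fn in (check_per_item, check_joined):
    seconds = timeit.timeit(lambda: fn(data, keywords), number=100)
    print(f"{fn.__name__}: {seconds:.3f} s for 100 runs")
```

The no-match case is used because both functions must then examine all of the data, which makes the difference between them easiest to see.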
