Remove uni-grams from a list of bi-grams

北海茫月 2021-01-22 00:26

I have managed to create 2 lists from text documents. The first is my bi-gram list:

keywords = ['nike shoes', 'nike clothing', 'nike black', 'nike white']


        
2 Answers
  •  半阙折子戏
    2021-01-22 00:35

    Assuming you have the 2 lists, this will do what you want:

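    new_keywords = []

    for k in keywords:
        matched = False  # becomes True once any stop word is found in k

        for s in stops:
            if s in k:  # substring test, not a whole-word match
                new_keywords.append(k.replace(s, ""))
                matched = True

        if not matched:
            new_keywords.append(k)  # no stop word found: keep the bi-gram unchanged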
    

    This will create a list like the one you posted:

    ['nike shoes', 'nike ', 'nike ', 'nike ']
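
    Note that the entries above keep a trailing space, because `replace` removes only the stop word itself. If you would rather get clean entries, a minimal sketch is to split each bi-gram into words and drop the stop words before rejoining. The contents of `stops` are not shown in the question, so the values below are an assumption chosen to match the output above:

```python
# Assumed inputs: `stops` is not shown in the question, so these values are hypothetical.
keywords = ['nike shoes', 'nike clothing', 'nike black', 'nike white']
stops = ['clothing', 'black', 'white']

stop_set = set(stops)  # membership test against whole words

new_keywords = []
for k in keywords:
    # Split the bi-gram into words and keep only those that are not stop words.
    kept = [word for word in k.split() if word not in stop_set]
    # Rejoin with single spaces, so no trailing space is left behind.
    new_keywords.append(" ".join(kept))

print(new_keywords)  # ['nike shoes', 'nike', 'nike', 'nike']
```

    Unlike the substring test, this only removes a stop word when it matches a whole word of the bi-gram.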
    

    To eliminate the duplicates, do this:

    new_keywords = list(set(new_keywords))
    

    So the final list looks like this (the order of the elements may differ, because a set is unordered):

    ['nike shoes', 'nike ']
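
    If the original order matters, one possible alternative to `set` is `dict.fromkeys`, which also drops repeats but keeps the first-seen order. This is only a sketch; the input list is copied from the intermediate output shown earlier:

```python
# Example input copied from the intermediate output shown earlier.
new_keywords = ['nike shoes', 'nike ', 'nike ', 'nike ']

# Dict keys are unique and keep insertion order (guaranteed since Python 3.7),
# so this removes repeats while preserving the order of first appearance.
unique_keywords = list(dict.fromkeys(new_keywords))

print(unique_keywords)  # ['nike shoes', 'nike ']
```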
    

