Removing words from a list in Python

淺唱寂寞╮ submitted on 2020-06-16 17:01:32

Question


I have a list of strings, abc, and I am trying to remove from each string the words that appear in the list stop, as well as all the digits.

abc=[ 'issues in performance 421',
 'how are you doing',
 'hey my name is abc, 143 what is your name',
 'attention pleased',
 'compliance installed 234']
stop=['attention', 'installed']

I am using a list comprehension to remove them, but the code below does not remove the words.

new_word=[word for word in abc if word not in stop ]

Result (the word 'attention' is still present):

['issues in performance',
 'how are you doing',
 'hey my name is abc, what is your name',
 'attention pleased',
 'compliance installed']

Desired output:

 ['issues in performance',
     'how are you doing',
     'hey my name is abc, what is your name',
     'pleased',
     'compliance']

Thanks


Answer 1:


Here is a solution using a simple regular expression with the re.sub method. It removes the numbers as well.

import re

abc=[ 'issues in performance 421',
 'how are you doing',
 'hey my name is abc, 143 what is your name',
 'attention pleased',
 'compliance installed 234']
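# Raw strings keep the \s escapes intact; '[0-9]' also strips every digit.
stop = [r'attention\s+', r'installed\s+', r'[0-9]']

# Join the patterns into one alternation and delete every match.
[re.sub('|'.join(stop), '', x) for x in abc]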


Output:
['issues in performance ',
'how are you doing',
 'hey my name is abc,  what is your name',
 'pleased',
 'compliance ']
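
Deleting matches without tidying up afterwards can leave trailing or doubled spaces behind. Below is a minimal sketch of one way to avoid that, assuming whole-word matching is what is wanted: the pattern is built with re.escape and \b word boundaries, and the leftover whitespace is collapsed with split and join. The helper name clean_phrase is hypothetical and not part of the original answer.

```python
import re

abc = ['issues in performance 421',
       'how are you doing',
       'hey my name is abc, 143 what is your name',
       'attention pleased',
       'compliance installed 234']
stop = ['attention', 'installed']

# Match any stop word as a whole word, or any run of digits as a whole token.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, stop)) + r'|\d+)\b')

def clean_phrase(phrase):
    """Hypothetical helper: drop matches, then collapse leftover whitespace."""
    return ' '.join(pattern.sub('', phrase).split())

result = [clean_phrase(p) for p in abc]
print(result)
```

Because the stop words are escaped and wrapped in \b, a word such as 'attentions' would be left alone, which may or may not be what you want.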



Answer 2:


You need to split each phrase into words and re-join the words into phrases after filtering out those in stop.

[' '.join(w for w in p.split() if w not in stop) for p in abc]

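With the lists from the question, this outputs the following (the numbers are left in place, since only the words in stop are filtered out):

['issues in performance 421', 'how are you doing', 'hey my name is abc, 143 what is your name', 'pleased', 'compliance 234']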
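
To see why the comprehension in the question leaves every item in place, it can help to compare the two membership tests side by side. A small sketch, with variable names chosen only for illustration:

```python
stop = ['attention', 'installed']
phrase = 'attention pleased'

# The question's test compares the whole phrase against each stop word,
# so it only matches if the phrase is exactly 'attention' or 'installed'.
whole_phrase_is_stop_word = phrase in stop

# Splitting first compares the individual words instead.
words_in_stop = [w for w in phrase.split() if w in stop]

print(whole_phrase_is_stop_word)
print(words_in_stop)
```

The first test is False for every phrase in abc, which is why nothing gets filtered out there.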



Answer 3:


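list1 = []
for phrase in abc:
    cleaned = phrase
    for remove_word in stop:
        # Apply each replacement to the running result so that
        # every stop word is removed, not only the last one.
        cleaned = cleaned.replace(remove_word, '')
    list1.append(cleaned)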
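
One thing to be aware of with str.replace is that it works on substrings rather than words, and it leaves the surrounding spaces behind. Here is a short sketch of both effects. The helper strip_stop_words and the phrase 'driver uninstalled' are made up purely for illustration.

```python
stop = ['attention', 'installed']

def strip_stop_words(phrase, stop_words):
    """Illustrative helper: remove each stop word with str.replace."""
    for remove_word in stop_words:
        phrase = phrase.replace(remove_word, '')
    return phrase

# A leading space remains where the word used to be.
leading = strip_stop_words('attention pleased', stop)

# Substring matching also cuts into longer words (hypothetical input).
partial = strip_stop_words('driver uninstalled', stop)

print(repr(leading))
print(repr(partial))
```

If that matters for your data, splitting into words first or using a regex with word boundaries is probably the safer route.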



Answer 4:


This is how I'd do it, at least:

abc=[ 'issues in performance 421',
    'how are you doing',
    'hey my name is abc, 143 what is your name',
    'attention pleased',
    'compliance installed 234'
]
stop=['attention', 'installed']
for i, phrase in enumerate(abc):
    # Keep only the words that are neither stop words nor pure digits.
    abc[i] = " ".join(w for w in phrase.split() if w not in stop and not w.isdigit())
print(abc)

result:

['issues in performance',
    'how are you doing',
    'hey my name is abc, what is your name',
    'pleased',
    'compliance']

Hope it helps in any way :)




Answer 5:


A set works well for this. Because each item may contain more than one word, you can't test the item against stop with in directly. Instead, split the item into words and use & to intersect them with the stop set, which gives the words they have in common. A non-empty intersection is truthy, and since you only care about the items without any stop word, the condition uses if not. Note that this drops the whole item rather than removing just the stop word from it.

new_word=[word for word in abc if  not set(word.split(' ')) & set(stop)]

UPDATE

If you also want to drop every item that contains a number, you can do it like this:

new_word=[word for word in abc if  not (set(word.split(' ')) & set(stop) or any([i.strip().isdigit() for i in word.split(' ')]))]
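
To make the effect of this filter concrete, here is a sketch of the same idea run on the lists from the question. It uses set.isdisjoint in place of the & test, which is a substitution made here and not the answer's own code. The names stop_set, no_stop and no_stop_no_digits are illustrative.

```python
abc = ['issues in performance 421',
       'how are you doing',
       'hey my name is abc, 143 what is your name',
       'attention pleased',
       'compliance installed 234']
stop = ['attention', 'installed']

stop_set = set(stop)

# Keep only the items that share no word with the stop set.
no_stop = [p for p in abc if stop_set.isdisjoint(p.split())]

# Additionally drop items in which any whitespace-separated token is all digits.
no_stop_no_digits = [p for p in no_stop
                     if not any(tok.isdigit() for tok in p.split())]

print(no_stop)
print(no_stop_no_digits)
```

With this data only 'how are you doing' survives both filters, so this approach suits cases where whole items should be discarded, not cleaned.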


Source: https://stackoverflow.com/questions/51317357/removing-words-from-list-in-python
