Better way to remove multiple words from a string?

我怕爱的太早我们不能终老 submitted on 2020-03-13 06:33:29

Question


bannedWord = ['Good', 'Bad', 'Ugly']

def RemoveBannedWords(toPrint, database):
    statement = toPrint
    # Iterate over the list that was passed in, not the global bannedWord.
    for word in database:
        if word in statement:
            statement = statement.replace(word + ' ', '')
    return statement

toPrint = 'Hello Ugly Guy, Good To See You.'

print(RemoveBannedWords(toPrint, bannedWord))

The output is Hello Guy, To See You. This works, but knowing Python, I feel there must be a better way to remove several words from a string. I found some similar solutions that use dictionaries, but they didn't seem to fit this situation.


Answer 1:


Here's a solution with regex:

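import re

bannedWord = ['Good', 'Bad', 'Ugly']

def RemoveBannedWords(toPrint, database):
    # Build one alternation from the list; re.escape guards against regex metacharacters.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, database)) + r")\W", re.I)
    return pattern.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'

print(RemoveBannedWords(toPrint, bannedWord))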
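To see what the \b and \W in this pattern do, here is a small sketch. The sample strings are made up for illustration and are not from the question. \b keeps the pattern from firing inside longer words, while the trailing \W has to consume one non-word character, so a banned word at the very end of the string is left alone.

```python
import re

# The pattern from the answer above, written out literally.
pattern = re.compile(r"\b(Good|Bad|Ugly)\W", re.I)

# Illustrative inputs (assumptions, not from the original question).
samples = [
    "Hello Ugly Guy, Good To See You.",  # banned words followed by a space
    "Goodness, that is Badly done.",     # banned words appear only as prefixes
    "This is Ugly",                      # banned word at the very end
]

results = [pattern.sub("", s) for s in samples]
for before, after in zip(samples, results):
    print(repr(before), "->", repr(after))
```

Only the first string changes; the other two come back untouched, which is why the later answers worry about the last word.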



Answer 2:


I use

bannedWord = ['Good','Bad','Ugly']
toPrint = 'Hello Ugly Guy, Good To See You.'
print(' '.join(i for i in toPrint.split() if i not in bannedWord))
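The split-and-join filter compares whole tokens exactly, so 'Ugly,' or 'ugly' would slip through. A minimal sketch of one way to loosen that, assuming it is acceptable to lose punctuation attached to a removed word; the names remove_banned_tokens and banned are hypothetical:

```python
import string

# Lowercase set: constant-time lookups and case-insensitive comparison.
banned = {"good", "bad", "ugly"}

def remove_banned_tokens(text, banned_words):
    """Drop whitespace-separated tokens whose core word is banned."""
    kept = []
    for token in text.split():
        # Compare the token without surrounding punctuation, in lowercase.
        core = token.strip(string.punctuation).lower()
        if core not in banned_words:
            kept.append(token)
    return " ".join(kept)

print(remove_banned_tokens("Hello ugly Guy, Good, To See You.", banned))
```

Like the original one-liner, this collapses runs of whitespace into single spaces, because split() discards them.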



Answer 3:


A slight variation on Ajay's code, for the case where one entry in the bannedWord list is a substring of another:

bannedWord = ['good', 'bad', 'good guy', 'ugly']

With toPrint = 'good winter good guy', the result would be

RemoveBannedWords(toPrint, database=bannedWord) = 'winter guy'

because 'good' is removed first, so 'good guy' never gets a chance to match. The entries need to be sorted by length, longest first, so that the longer phrase is tried before its substring.

import re

def RemoveBannedWords(toPrint, database):
    # Longest entries first, so 'good guy' is tried before 'good'.
    database_1 = sorted(database, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(database_1) + r")\W", re.I)
    # Append a space so a banned word at the very end is followed by a \W
    # character and can match, then drop the last character of the result.
    return pattern.sub("", toPrint + ' ')[:-1]

toPrint = 'good winter good guy.'

print(RemoveBannedWords(toPrint, bannedWord))
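As an alternative to appending a space and slicing it off again, the pattern itself can accept the end of the string. This is a sketch rather than the answer's own method; build_banned_pattern and banned_phrases are hypothetical names, and re.escape is added on the assumption that entries may contain regex metacharacters.

```python
import re

def build_banned_pattern(words):
    """Compile one case-insensitive pattern, longest phrases first."""
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # (?:\W|$) accepts one trailing non-word character or the end of the string.
    return re.compile(r"\b(?:" + alternation + r")(?:\W|$)", re.I)

banned_phrases = ['good', 'bad', 'good guy', 'ugly']
pattern = build_banned_pattern(banned_phrases)

print(repr(pattern.sub("", "good winter good guy")))  # 'winter '
print(repr(pattern.sub("", "winter is good")))        # 'winter is '
```

A trailing space can remain when the removed word was last, so callers may want to strip() the result.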



Answer 4:


Yet another variation on the theme. If you are going to call this a lot, it is best to compile the regex once, outside the function, to improve speed:

import re

bannedWord = ['Good', 'Bad', 'Ugly']
# Compiled once at import time and reused on every call.
re_banned_words = re.compile(r"\b(" + "|".join(bannedWord) + r")\W", re.I)

def RemoveBannedWords(toPrint):
    # No global statement is needed: the name is only read, never rebound.
    return re_banned_words.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'
print(RemoveBannedWords(toPrint))
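If a module-level pattern feels too rigid, the same compile-once idea can be wrapped in a small factory. This is only a sketch; make_banned_word_remover and remove_banned are hypothetical names, and re.escape is an added precaution not present in the answer.

```python
import re

def make_banned_word_remover(words):
    """Compile the pattern once and return a function that reuses it."""
    alternation = "|".join(map(re.escape, words))
    pattern = re.compile(r"\b(?:" + alternation + r")\W", re.I)

    def remove(text):
        # The compiled pattern is captured by the closure, not rebuilt per call.
        return pattern.sub("", text)

    return remove

remove_banned = make_banned_word_remover(['Good', 'Bad', 'Ugly'])
print(remove_banned('Hello Ugly Guy, Good To See You.'))
```

The re module also keeps its own cache of recently compiled patterns, so the gain from precompiling comes mostly from skipping the string building and the cache lookup on each call.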


Source: https://stackoverflow.com/questions/31273642/better-way-to-remove-multiple-words-from-a-string
