Profanity Filter using a Regular Expression (list of 100 words)

孤街浪徒 2020-12-31 17:53

What is the correct way to strip profane words from a string given:
1) I have a list of 100 words to look for in an array of strings.
2) What is the correct way to handl

2 answers
  •  攒了一身酷
    2020-12-31 18:32

    1. Join the words into a single alternation group: (foobar|foobaz|...)
    2. Then put guards on either side of the group to absorb extraneous characters:

      [^!@#$%^&*]*(foobar|foobaz|foofii)[^!@#$%^&*]*

    Also, you'll probably want to use a case-insensitive flag so that it also matches words like FooBaz and fOObaR.
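
The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the word list is the placeholder one from the example pattern, the names `BAD_WORDS`, `build_pattern` and `censor` are hypothetical, and `\b` word boundaries are swapped in for the character-class guards shown above so that only whole words are masked.

```python
import re

# Placeholder list taken from the example pattern; a real filter would
# load its ~100 words here instead.
BAD_WORDS = ["foobar", "foobaz", "foofii"]


def build_pattern(words):
    """Combine all words into one case-insensitive alternation."""
    # Try longer words first so a short word cannot shadow a longer one
    # that starts with it.
    ordered = sorted(words, key=len, reverse=True)
    # Escape each word so any regex metacharacters are matched literally.
    alternation = "|".join(re.escape(w) for w in ordered)
    # Assumption of this sketch: \b boundaries replace the original guards.
    # Remove them to also match the words inside longer strings.
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PROFANITY = build_pattern(BAD_WORDS)


def censor(text, mask="*"):
    """Replace each matched word with a mask of the same length."""
    return PROFANITY.sub(lambda m: mask * len(m.group(0)), text)
```

With this sketch, `censor("FooBaz said fOObaR!")` masks both words regardless of case, while a longer word such as `foobars` is left alone because of the word boundaries.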

    As far as performance goes, combining everything into one big regex is probably fastest (although I'm not an expert). Regex engines are generally efficient at searching and at handling branches, so a single pass over the text should do better than the O(mn) cost of scanning the text once per word (where m is the number of words and n is the length of the text you're searching).
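
One way to check the performance claim rather than take it on trust is to time both approaches on your own data. The sketch below is an assumption-laden illustration: `WORDS` and `TEXT` are made-up stand-ins, and the helper names are hypothetical. Whether the combined pattern wins depends on the regex engine; Python's `re` is a backtracking engine that tries alternatives in order, so the gain may be smaller than on an automaton-based engine.

```python
import re
import timeit

# Made-up stand-ins for the real word list and input text.
WORDS = ["word%03d" % i for i in range(100)]
TEXT = "some harmless text with word042 and WORD007 in it " * 200

# One big alternation versus one compiled pattern per word.
COMBINED = re.compile("|".join(re.escape(w) for w in WORDS), re.IGNORECASE)
SEPARATE = [re.compile(re.escape(w), re.IGNORECASE) for w in WORDS]


def found_combined(text):
    # A single scan of the text using the combined pattern.
    return {m.group(0).lower() for m in COMBINED.finditer(text)}


def found_separate(text):
    # One search of the text per word, m searches in total.
    return {w for w, p in zip(WORDS, SEPARATE) if p.search(text)}


if __name__ == "__main__":
    # Print the measured times; the numbers vary by machine and input.
    print("combined:", timeit.timeit(lambda: found_combined(TEXT), number=20))
    print("separate:", timeit.timeit(lambda: found_separate(TEXT), number=20))
```

Both helpers report the same set of words for this input, so the timing compares like with like.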
