How do you implement a good profanity filter?

误落风尘 2020-11-22 04:27

Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs

21 answers
  •  予麋鹿 (original poster)
    2020-11-22 05:26

    Frankly, I'd let users get their "trick the system" words out and then ban them instead, but that's just me. It also makes the programming simpler.

    What I'd do is implement a regex filter like so: /[\s]dooby (doo?)[\s]/i or, if the word is a prefix of other forms, /[\s]doob(er|ed|est)[\s]/. Requiring whitespace on both sides keeps the filter from flagging words like assuaged, which is perfectly valid, but it also means you have to know the other variants and update the filter whenever you learn a new one. Obviously these are only examples; you'd have to decide how to do it yourself.
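
For illustration, here is a minimal Python sketch of that kind of whole-word regex filter. It is an adaptation, not the exact patterns above: it swaps the [\s] delimiters for \b word boundaries so that a match at the very start or end of the input, or next to punctuation, is still caught, and it treats the trailing "doo" as optional. The names BLOCKED_PATTERNS, contains_profanity, and censor are hypothetical, and "dooby"/"doob" remain placeholders for real entries.

```python
import re

# Placeholder patterns adapted from the examples above (assumption: a real
# list would be maintained by hand and extended as new variants turn up).
BLOCKED_PATTERNS = [
    r"dooby(?: doo)?",      # "dooby" or "dooby doo"
    r"doob(?:er|ed|est)",   # one stem with a fixed set of suffixes
]

# \b on both sides means a pattern only matches as a whole word, so a
# blocked string embedded inside a longer, innocent word is left alone.
PROFANITY_RE = re.compile(
    r"\b(?:" + "|".join(BLOCKED_PATTERNS) + r")\b",
    re.IGNORECASE,
)


def contains_profanity(text: str) -> bool:
    """Return True if any blocked pattern appears as a whole word."""
    return PROFANITY_RE.search(text) is not None


def censor(text: str, mask: str = "*") -> str:
    """Replace each whole-word match with mask characters of equal length."""
    return PROFANITY_RE.sub(lambda m: mask * len(m.group(0)), text)
```

With this sketch, contains_profanity("Dooby doo!") is true, while "scoobydoobydo" and "doobers" pass through because the placeholder only appears inside a longer word. The trade-off is the one described above: every new variant has to be added to the list by hand.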

    I'm not about to type out all the words I know, not when I don't actually want to know them.
