What is the best stemming method in Python?

后端 未结 6 1589
被撕碎了的回忆
被撕碎了的回忆 2020-12-12 22:36

I tried all the nltk methods for stemming but it gives me weird results with some words.

Examples

It often cut end of words when it shouldn\'t do it :

6条回答
  •  暗喜
    暗喜 (楼主)
    2020-12-12 22:43

    Stemming is all about removing suffixes(usually only suffixes, as far as I have tried none of the nltk stemmers could remove a prefix, forget about infixes). So we can clearly call stemming as a dumb/ not so intelligent program. It doesn't check if a word has a meaning before or after stemming. For eg. If u try to stem "xqaing", although not a word, it will remove "-ing" and give u "xqa".

    So, in order to use a smarter system, one can use lemmatizers. Lemmatizers uses well-formed lemmas (words) in form of wordnet and dictionaries. So it always returns and takes a proper word. However, it is slow because it goes through all words in order to find the relevant one.

提交回复
热议问题