What is the difference between lemmatization and stemming?

无人共我 2020-12-07 08:25

When should I use each?

Also, does NLTK's lemmatization depend on parts of speech? Wouldn't it be more accurate if it did?

9 answers
  •  遥遥无期
    2020-12-07 08:55

    An example-driven explanation of the differences between lemmatization and stemming:

    Lemmatization handles matching “car” to “cars” along with matching “car” to “automobile”.

    Stemming handles matching “car” to “cars”.

    Lemmatization implies a broader scope of fuzzy word matching that is still handled by the same subsystems. It implies certain techniques for low level processing within the engine, and may also reflect an engineering preference for terminology.

    [...] Taking FAST as an example, their lemmatization engine handles not only basic word variations like singular vs. plural, but also thesaurus operators like having “hot” match “warm”.

    This is not to say that other engines don’t handle synonyms; of course they do. But the low-level implementation may be in a different subsystem than the one that handles base stemming.

    http://www.ideaeng.com/stemming-lemmatization-0601
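
To make the contrast concrete, here is a minimal sketch in plain Python. It is a toy, not how any real engine or NLTK works: `toy_stem`, `toy_lemmatize`, the `SUFFIXES` tuple, and the `LEMMAS` dictionary are all hypothetical names invented for this illustration. It uses the NLP sense of lemmatization (mapping a word to its dictionary form), which is narrower than the synonym matching described in the quote above. The point it tries to show is that a stemmer just chops suffixes, while a lemmatizer looks words up and can give different results depending on the part of speech.

```python
# Toy illustration only: real stemmers (e.g. Porter) and real lemmatizers
# (e.g. WordNet-based ones) use far richer rules and vocabularies.

SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical, deliberately tiny rule set


def toy_stem(word: str) -> str:
    """Strip the first matching suffix; the result need not be a real word."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words like "was" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lemma dictionary keyed by (word, part of speech).
LEMMAS = {
    ("cars", "n"): "car",
    ("better", "a"): "good",
    ("running", "v"): "run",
    ("running", "n"): "running",
    ("was", "v"): "be",
}


def toy_lemmatize(word: str, pos: str = "n") -> str:
    """Look up the dictionary form for this word and part of speech."""
    return LEMMAS.get((word, pos), word)


if __name__ == "__main__":
    for w in ("cars", "running", "better"):
        print(w, "->", toy_stem(w))
    print(toy_lemmatize("running"), toy_lemmatize("running", pos="v"))
    print(toy_lemmatize("better", pos="a"))
```

In this sketch the stemmer turns "running" into the non-word "runn", while the lemmatizer returns "run" only when told the word is a verb. That mirrors the part-of-speech question above: NLTK's `WordNetLemmatizer.lemmatize` accepts a `pos` argument and treats the word as a noun when it is omitted, so passing the correct tag generally gives better lemmas.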
