What is the difference between lemmatization and stemming?

无人共我 2020-12-07 08:25

When should I use each?

Also, does NLTK's lemmatization depend on part of speech? Wouldn't it be more accurate if it did?

9 answers
  •  感动是毒
    2020-12-07 08:58

    Lemmatisation is closely related to stemming. The difference is that a stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech. However, stemmers are typically easier to implement and run faster, and the reduced accuracy may not matter for some applications.

    For instance:

    1. The word "better" has "good" as its lemma. This link is missed by stemming, as it requires a dictionary look-up.

    2. The word "walk" is the base form of the word "walking", and hence this is matched in both stemming and lemmatisation.

    3. The word "meeting" can be either the base form of a noun or a form of a verb ("to meet") depending on the context, e.g., "in our last meeting" or "We are meeting again tomorrow". Unlike stemming, lemmatisation can in principle select the appropriate lemma depending on the context.

    Source: https://en.wikipedia.org/wiki/Lemmatisation
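
The three cases above can be sketched in a few lines of Python. This is a toy illustration only, not NLTK's implementation: `toy_stem`, `toy_lemmatize`, the suffix list, and the mini-dictionary are all hypothetical names and data made up for this example. It shows a context-free suffix stripper next to a lookup that takes the part of speech into account.

```python
# Toy illustration only: the rules and dictionary below are invented
# for this example and are far cruder than any real stemmer or lemmatizer.

SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Strip the first matching suffix, using no dictionary and no context."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical mini-dictionary keyed by (word, part of speech).
LEMMAS = {
    ("better", "adj"): "good",
    ("walking", "verb"): "walk",
    ("meeting", "noun"): "meeting",
    ("meeting", "verb"): "meet",
}


def toy_lemmatize(word, pos):
    """Look up the lemma for a word and its part of speech; fall back to the word."""
    return LEMMAS.get((word, pos), word)


examples = [
    ("better", "adj"),
    ("walking", "verb"),
    ("meeting", "noun"),
    ("meeting", "verb"),
]
for word, pos in examples:
    print(f"{word:8} {pos:5} stem={toy_stem(word):8} lemma={toy_lemmatize(word, pos)}")
```

The stemmer returns the same string for "meeting" whether it is a noun or a verb, while the lookup can return "meeting" or "meet" because it is told the part of speech. On the NLTK question: `WordNetLemmatizer.lemmatize(word, pos)` does accept a part-of-speech argument, and it defaults to noun, so passing the correct tag generally gives better results for verbs and adjectives.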
