Porter Stemming of fried

Submitted by 送分小仙女 on 2019-12-24 02:23:03

Question


Why does the Porter stemming algorithm online at http://text-processing.com/demo/stem/ stem fried to fri and not fry?

I can't recall any English past-tense words ending in -ied whose base form ends in i.

Is this a bug?


Answer 1:


A stem returned by the Porter stemmer is not necessarily the base form of a verb, or even a valid word. It is only what remains after the algorithm's suffix-stripping rules have been applied. If you need the base form, use a lemmatizer instead.
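To make the fri result less mysterious, here is a minimal sketch of the single rule from the original Porter algorithm that fires on fried: drop a trailing "ed" when what is left contains a vowel. The helper names is_vowel and strip_ed are hypothetical and this is not NLTK's code. It deliberately skips every other step of the algorithm, including the separate "eed" rule and the clean-up rules that follow, so it should not be used as a stemmer.

```python
VOWELS = set("aeiou")


def is_vowel(word, i):
    """Porter's vowel test: a, e, i, o, u, or y when preceded by a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return True
    # A leading y, or a y that follows a vowel, counts as a consonant.
    return ch == "y" and i > 0 and not is_vowel(word, i - 1)


def strip_ed(word):
    """Apply only the '(*v*) ED -> (nothing)' rule (hypothetical helper).

    Words ending in 'eed' fall under a different rule and are left untouched here.
    """
    if word.endswith("ed") and not word.endswith("eed"):
        stem = word[:-2]
        # The suffix is removed only if the remaining stem contains a vowel.
        if any(is_vowel(stem, i) for i in range(len(stem))):
            return stem
    return word


for w in ["fried", "tried", "bled"]:
    print(w, "->", strip_ed(w))
# fried -> fri
# tried -> tri
# bled -> bled
```

The rule only removes letters. Nothing in it turns the remaining i back into y, which is why the result is fri rather than fry.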




Answer 2:


Firstly, a stemmer is not a lemmatizer (see also "Stemmers vs Lemmatizers"):

>>> from nltk.stem import PorterStemmer, WordNetLemmatizer
>>> porter = PorterStemmer()
>>> wnl = WordNetLemmatizer()
>>> fried = 'fried'
>>> porter.stem(fried)
u'fri'
>>> wnl.lemmatize(fried)
'fried'

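Next, a lemmatizer is part-of-speech (POS) sensitive. WordNetLemmatizer treats its input as a noun unless told otherwise, so pass pos='v' to have fried handled as a verb: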

>>> wnl.lemmatize(fried, pos='v')
u'fry'


Source: https://stackoverflow.com/questions/27659179/porter-stemming-of-fried
