With NLTK, how can I generate different forms of a word when a certain word is given?

Posted by 别来无恙 on 2021-02-16 14:39:06

Question


For example, given the word "happy", I want to generate other forms of it, such as "happiness", "happily", etc.

I have read earlier questions on Stack Overflow and the NLTK documentation. However, they only cover POS tagging and morphological analysis, that is, identifying the grammatical form of words within a sentence, not generating a list of related word forms. Has anyone run into a similar problem? Thank you.


Answer 1:


This type of information is included in the Lemma class of NLTK's WordNet implementation. Specifically, it's found in Lemma.derivationally_related_forms().

Here's an example script that collects the derivational forms of "happy" recorded in WordNet:

# Requires the WordNet corpus: run nltk.download("wordnet") once beforehand.
from nltk.corpus import wordnet as wn

forms = set()  # a set eliminates duplicate derivational forms
for happy_lemma in wn.lemmas("happy"):  # for each "happy" lemma in WordNet
    forms.add(happy_lemma.name())  # add the lemma itself
    for related_lemma in happy_lemma.derivationally_related_forms():  # for each related lemma
        forms.add(related_lemma.name())  # add the related lemma

print(sorted(forms))  # show what was collected

Unfortunately, the information in WordNet is not complete. The above script finds "happy" and "happiness" but it fails to find "happily", even though there are multiple "happily" lemmas.
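
One possible workaround, offered as a sketch rather than a complete solution: WordNet links many adverbs back to the adjective they are derived from, and NLTK exposes that pointer as Lemma.pertainyms(). The link only runs from the adverb to the adjective, so the sketch below scans all adverb synsets for lemmas that point back at the given word. The function name related_forms and the choice to pass the corpus reader in as a parameter are my own assumptions, not part of NLTK, and whether "happily" is actually recovered depends on the pointers present in your WordNet version.

```python
def related_forms(word, wordnet):
    """Collect `word`, its derivationally related forms, and any adverbs
    whose pertainym pointer leads back to `word`.

    `wordnet` is expected to be NLTK's WordNet corpus reader
    (from nltk.corpus import wordnet); it is passed in as an argument
    only to keep this hypothetical helper easy to test.
    """
    forms = {word}

    # Forward direction: the derivational links used in the script above.
    for lemma in wordnet.lemmas(word):
        for related in lemma.derivationally_related_forms():
            forms.add(related.name())

    # Reverse direction: look through adverb synsets ("r") for lemmas
    # whose pertainyms include a lemma named `word`.
    for synset in wordnet.all_synsets("r"):
        for adverb in synset.lemmas():
            if any(p.name() == word for p in adverb.pertainyms()):
                forms.add(adverb.name())

    return forms
```

To try it, import wn as in the script above and call related_forms("happy", wn). The reverse scan visits every adverb synset on each call, which is only a few thousand entries, but if you need many lookups it would be better to build an adjective-to-adverb index once and reuse it.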



Source: https://stackoverflow.com/questions/45145020/with-nltk-how-can-i-generate-different-form-of-word-when-a-certain-word-is-giv
