Determining Hypernym or Hyponym using wordnet nltk

Submitted by 笑着哭i on 2019-11-30 07:31:34
alvas

Firstly, there is a difference between a word and a synset (concept) in WordNet.

Here we see that one word can have multiple meanings (i.e. it links to multiple concepts):

>>> from nltk.corpus import wordnet as wn
>>> car = 'car'
>>> auto = 'automobile'
>>> wn.synsets(auto)
[Synset('car.n.01'), Synset('automobile.v.01')]
>>> wn.synsets(car)
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]

In this case 'automobile' and 'car' can both refer to the same Synset('car.n.01'). When they do, they are synonyms, so there is no hypo/hypernym relationship between them.
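The overlap can be checked directly by intersecting the two synset lists. This is a minimal sketch: `shared_synsets` is a hypothetical helper name, not part of NLTK, and the `lookup` parameter is an assumption added only so the function can be tried without the WordNet corpus installed.

```python
def shared_synsets(word_a, word_b, lookup=None):
    """Return the set of synsets that both words can refer to.

    `lookup` maps a word to a list of synsets. When it is omitted,
    NLTK's wn.synsets is used (this needs the WordNet corpus).
    """
    if lookup is None:
        # Imported lazily so the helper can be used with a custom lookup.
        from nltk.corpus import wordnet as wn
        lookup = wn.synsets
    return set(lookup(word_a)) & set(lookup(word_b))
```

With WordNet installed, `shared_synsets('car', 'automobile')` should contain Synset('car.n.01'), in line with the two listings above.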

There's also the notion of a lemma, which will just complicate things, so we'll skip that for now.

Let's say you are comparing synsets rather than words. Then you can simply find all hyponyms of one synset and check whether the other synset occurs among them.

If you're comparing plain words, see "How to get all the hyponyms of a word/synset in python nltk and wordnet?"

The code below shows how to compare synsets. For the sake of the example, I'll use 'fruit' and 'apple', which make more sense than 'automobile' and 'car', since the only noun synset for 'automobile' is Synset('car.n.01'), which it shares with 'car'.

>>> from nltk.corpus import wordnet as wn
>>>
>>> fruit = 'fruit'
>>> wn.synsets(fruit)
[Synset('fruit.n.01'), Synset('yield.n.03'), Synset('fruit.n.03'), Synset('fruit.v.01'), Synset('fruit.v.02')]
>>> wn.synsets(fruit)[0].definition()
u'the ripened reproductive body of a seed plant'
>>> fruit = wn.synsets(fruit)[0]
>>> 
>>> apple = 'apple'
>>> wn.synsets(apple)
[Synset('apple.n.01'), Synset('apple.n.02')]
>>> wn.synsets(apple)[0].definition()
u'fruit with red or yellow or green skin and sweet to tart crisp whitish flesh'
>>> apple = wn.synsets(apple)[0]
>>>

Below, we see that apple is not among fruit's direct hyponyms:

>>> fruit.hyponyms()
[Synset('accessory_fruit.n.01'), Synset('achene.n.01'), Synset('acorn.n.01'), Synset('aggregate_fruit.n.01'), Synset('berry.n.02'), Synset('buckthorn_berry.n.01'), Synset('buffalo_nut.n.01'), Synset('chokecherry.n.01'), Synset('cubeb.n.01'), Synset('drupe.n.01'), Synset('ear.n.05'), Synset('edible_fruit.n.01'), Synset('fruitlet.n.01'), Synset('gourd.n.02'), Synset('hagberry.n.01'), Synset('hip.n.05'), Synset('juniper_berry.n.01'), Synset('marasca.n.01'), Synset('may_apple.n.01'), Synset('olive.n.01'), Synset('pod.n.02'), Synset('pome.n.01'), Synset('prairie_gourd.n.01'), Synset('pyxidium.n.01'), Synset('quandong.n.02'), Synset('rowanberry.n.01'), Synset('schizocarp.n.01'), Synset('seed.n.01'), Synset('wild_cherry.n.01')]
>>> 
>>> apple in fruit.hyponyms()
False

So we have to walk down through the hyponyms of the hyponyms, and so on (the transitive closure), and see whether apple turns up anywhere among them:

>>> hypofruits = set(fruit.closure(lambda s: s.hyponyms()))
>>> apple in hypofruits
True
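To make the idea of a closure concrete, here is a rough pure-Python sketch of a breadth-first walk over a relation. It is not NLTK's implementation, and `TOY_HYPONYMS` is a small hand-written hierarchy used only for illustration, not real WordNet data.

```python
from collections import deque

# Hand-written toy hierarchy (assumption for illustration, not WordNet data).
TOY_HYPONYMS = {
    "fruit": ["edible_fruit", "pome"],
    "edible_fruit": ["apple", "berry"],
    "pome": ["apple"],
    "apple": [],
    "berry": [],
}


def closure(start, rel):
    """Yield every node reachable from `start` by repeatedly applying `rel`."""
    seen = set()
    queue = deque(rel(start))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # reached earlier via another path
        seen.add(node)
        yield node
        queue.extend(rel(node))


hypofruits = set(closure("fruit", lambda n: TOY_HYPONYMS.get(n, [])))
print("apple" in TOY_HYPONYMS["fruit"])  # False: not a direct hyponym
print("apple" in hypofruits)             # True: reachable transitively
```

The `seen` set matters because a node can be reached along more than one path (here 'apple' sits under both 'edible_fruit' and 'pome').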

There you have it! For the sake of completeness:

>>> hyperapple = set(apple.closure(lambda s: s.hypernyms()))
>>> fruit in hyperapple
True
>>> hypoapple = set(apple.closure(lambda s: s.hyponyms()))
>>> fruit in hypoapple
False
>>> hyperfruit = set(fruit.closure(lambda s: s.hypernyms()))
>>> apple in hyperfruit
False
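These checks can be folded into one small helper. The sketch below is only one way to do it: `hypernym_relation` is a hypothetical name, not an NLTK function, and it assumes its arguments behave like NLTK Synset objects, i.e. they provide `closure()` and `hypernyms()`.

```python
def hypernym_relation(a, b):
    """Describe how synset `a` relates to synset `b`.

    Returns 'same synset', 'hyponym' (a is a kind of b),
    'hypernym' (b is a kind of a) or 'unrelated'.
    """
    if a == b:
        return "same synset"
    # Walk upwards from a: if b is an ancestor, a is a hyponym of b.
    if b in set(a.closure(lambda s: s.hypernyms())):
        return "hyponym"
    # Walk upwards from b: if a is an ancestor, a is a hypernym of b.
    if a in set(b.closure(lambda s: s.hypernyms())):
        return "hypernym"
    return "unrelated"
```

With the `apple` and `fruit` synsets defined earlier, `hypernym_relation(apple, fruit)` should give 'hyponym' and `hypernym_relation(fruit, apple)` should give 'hypernym', matching the results above. Both tests walk up the hypernym chain rather than down through the hyponyms, since a synset usually has far fewer ancestors than descendants.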