Using Arabic WordNet for synonyms in Python?

Posted by 纵饮孤独 on 2019-12-18 13:38:38

Question


I am trying to get synonyms for the Arabic words in a sentence.

If the word is in English it works perfectly, and the results are displayed in Arabic. I was wondering if it's possible to get the synonyms of an Arabic word directly, without writing it in English first.

I tried the following, but it didn't work. I would also prefer results without tashkeel (diacritics): انتظار instead of اِنْتِظار

from nltk.corpus import wordnet as omw
jan = omw.synsets('انتظار ')[0]
print(jan)
print(jan.lemma_names(lang='arb'))

Answer 1:


The WordNet used in NLTK doesn't support Arabic. If you are looking for Arabic WordNet (AWN), that is a totally different thing.

For Arabic WordNet, download:

  • http://nlp.lsi.upc.edu/awn/get_bd.php
  • http://nlp.lsi.upc.edu/awn/AWNDatabaseManagement.py.gz

You run it with:

$ python AWNDatabaseManagement.py -i upc_db.xml

Now, to get something like wn.synset('إنتظار'): Arabic WordNet has a function wn.get_synsets_from_word(word), but it returns offsets. It also accepts words only as they are vocalized in the database. For example, you should use جَمِيل for جميل:

>>> wn.get_synsets_from_word(u"جَمِيل")
[(u'a', u'300218842')]

300218842 is the offset of the synset of جميل.

I checked the word إنتظار, and it doesn't seem to exist in AWN.
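
Since the lookup only accepts vocalized forms, one possible workaround is to strip the tashkeel from the database entries and build a reverse index from each unvocalized form to its vocalized variants. The following is a minimal sketch, not part of AWN: strip_tashkeel, build_unvocalized_index, and sample_entries are hypothetical names, and it assumes you can obtain the list of vocalized words from the database by some means. Each candidate it finds could then be passed to wn.get_synsets_from_word, and strip_tashkeel can also be applied to results if you want them displayed without diacritics.

```python
import re
from collections import defaultdict

# Tashkeel marks: U+064B..U+0652 (tanween, short vowels, shadda, sukun)
# plus the superscript alef U+0670. Hamza forms such as U+0625 are
# separate letters and are deliberately left untouched.
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def strip_tashkeel(word):
    """Return the word with tashkeel marks removed."""
    return TASHKEEL.sub("", word)


def build_unvocalized_index(vocalized_words):
    """Map each unvocalized form to the vocalized forms that reduce to it."""
    index = defaultdict(list)
    for word in vocalized_words:
        index[strip_tashkeel(word)].append(word)
    return dict(index)


# Hypothetical sample; in practice these entries would come from the
# AWN database (assumption: you have a way to list its vocalized words).
sample_entries = ["\u062C\u064E\u0645\u0650\u064A\u0644"]  # the vocalized form of "jamil"
index = build_unvocalized_index(sample_entries)

# Look up the unvocalized spelling to recover the vocalized candidates.
candidates = index.get("\u062C\u0645\u064A\u0644", [])
print(candidates)
```

Several vocalized words can share one unvocalized spelling, which is why the index maps to a list rather than a single value; you may need to try each candidate or disambiguate by part of speech.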

More details about using AWN to get synonyms here.



Source: https://stackoverflow.com/questions/34620627/using-arabic-wordnet-for-synonyms-in-python
