Question
I am trying to get the synonyms for Arabic words in a sentence.
If the word is in English it works perfectly, and the results are displayed in Arabic. I was wondering whether it is possible to get the synonyms of an Arabic word directly, without writing it in English first.
I tried that, but it didn't work. I would also prefer results without tashkeel: انتظار instead of اِنْتِظار.
from nltk.corpus import wordnet as omw
jan = omw.synsets('انتظار ')[0]
print(jan)
print(jan.lemma_names(lang='arb'))
Answer 1:
The WordNet used in NLTK doesn't support Arabic. If you are looking for Arabic WordNet, that is a completely different resource.
For Arabic wordnet, download:
- http://nlp.lsi.upc.edu/awn/get_bd.php
- http://nlp.lsi.upc.edu/awn/AWNDatabaseManagement.py.gz
You run it with:
$ python AWNDatabaseManagement.py -i upc_db.xml
Now, to get something like wn.synset('إنتظار'): Arabic WordNet has a function wn.get_synsets_from_word(word), but it returns offsets. It also accepts words only as they are vocalized in the database. For example, you should use جَمِيل for جميل:
>> wn.get_synsets_from_word(u"جَمِيل")
[(u'a', u'300218842')]
300218842 is the offset of the synset of جميل.
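Because the lookup expects the vocalized form while the question asks for words without tashkeel, one possible workaround is to strip the diacritics yourself and keep a map from each bare form back to its vocalized spellings. The following is only a minimal sketch under stated assumptions: strip_tashkeel and build_unvocalized_index are hypothetical helper names that are not part of AWN or NLTK, the diacritic range covers the common harakat only, and the sample list stands in for words that would really come from the AWN database.

```python
import re
from collections import defaultdict

# Common Arabic diacritics (tashkeel): fathatan through sukun
# (U+064B to U+0652) plus the superscript alef (U+0670).
# Assumption: this range is enough for the words being looked up.
TASHKEEL = re.compile(r"[\u064B-\u0652\u0670]")


def strip_tashkeel(word):
    """Return the word with the diacritic marks above removed."""
    return TASHKEEL.sub("", word)


def build_unvocalized_index(vocalized_words):
    """Map each bare form to the set of vocalized forms that reduce to it."""
    index = defaultdict(set)
    for word in vocalized_words:
        index[strip_tashkeel(word)].add(word)
    return index


# Hypothetical sample; in practice this list would be read from the AWN data.
sample = ["جَمِيل", "اِنْتِظار"]
index = build_unvocalized_index(sample)

# Vocalized candidates for a bare query word; each candidate could then be
# passed to wn.get_synsets_from_word as described above.
candidates = index[strip_tashkeel(sample[0])]
```

The same strip_tashkeel helper can be applied to the lemma names returned by a lookup, so that results are displayed without tashkeel. Note that several vocalized words can share one bare form, which is why the index stores sets rather than single values.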
I checked for the word إنتظار and it seems that it doesn't exist in AWN.
More details about using AWN to get synonyms here.
Source: https://stackoverflow.com/questions/34620627/using-arabic-wordnet-for-synonyms-in-python