Import Arabic WordNet in Python

只愿长相守 posted on 2019-11-29 04:50:05

AWNDatabaseManagement.py expects an -i argument whose value is the path to the Arabic WordNet database. If the argument is not given, the script falls back to the default path E:/usuaris/horacio/arabicWN/AWNdatabase/upc_db.xml.

To resolve this, download the Arabic WordNet XML database, upc_db.xml. I suggest placing it in the same folder as the script AWNDatabaseManagement.py. Then run:

$ python AWNDatabaseManagement.py -i upc_db.xml

This is what I got after running it, with no errors:

processing file  upc_db.xml
<open file 'upc_db.xml', mode 'r' at 0xb74689c0>

Alternatively, change line 320 from

opts['i']='E:/usuaris/horacio/arabicWN/AWNdatabase/upc_db.xml'

to

opts['i']='upc_db.xml'

and then run the script without -i.

You can then load it from Python:
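The fallback described above (use the value of -i when it is given, otherwise a default path) can be sketched with the standard argparse module. This is not the script's own option handling, which is not shown here; `build_parser` and `DEFAULT_DB` are names made up for the illustration.

```python
import argparse

# Hypothetical default: the database file sits next to the script.
DEFAULT_DB = "upc_db.xml"


def build_parser():
    """Return a parser with a single optional -i argument."""
    parser = argparse.ArgumentParser(
        description="Load the Arabic WordNet XML database")
    parser.add_argument("-i", dest="i", default=DEFAULT_DB,
                        help="path to upc_db.xml")
    return parser


# Explicit argument lists are used so the sketch does not depend on sys.argv.
opts = vars(build_parser().parse_args([]))
print(opts["i"])  # no -i given, so the default path is used

opts = vars(build_parser().parse_args(["-i", "data/upc_db.xml"]))
print(opts["i"])  # the path passed with -i
```

With a default like this, no edit to the source file is needed when the database lives in the working directory.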

>> from AWNDatabaseManagement import wn

If this fails, check that the XML resource is at the right path.
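A quick way to rule out a wrong path is to check for the file before importing the module. The helper below is only a sketch; `check_database` is a hypothetical name and is not part of AWNDatabaseManagement.py.

```python
import os


def check_database(path):
    """Return the absolute path of the XML file, or raise IOError if missing."""
    full_path = os.path.abspath(path)
    if not os.path.isfile(full_path):
        raise IOError(
            "Arabic WordNet database not found: %s (current directory: %s)"
            % (full_path, os.getcwd()))
    return full_path


# Typical use before the import (kept as comments so nothing runs here):
#   check_database("upc_db.xml")
#   from AWNDatabaseManagement import wn
```

The error message includes the current working directory, since a relative path such as upc_db.xml is resolved against it rather than against the script's folder.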


Now, to get something like wn.synset('جميل'): Arabic WordNet provides the function wn.get_synsets_from_word(word), but it returns offsets. It also accepts words only as they are vocalized in the database. For example, you must use جَمِيل, not جميل:

>> wn.get_synsets_from_word(u"جَمِيل")
[(u'a', u'300218842')]

300218842 is the offset of the synset of جميل. I suggest using the following method instead. List the words with:

 >> for word,ids  in sorted(wn.get_words(False)):
 ..     print word, ids 

You will get a result like this:

 جَمِيعَة [u'jamiyEap_1']
 جَمِيل [u'jamiyl_1']
 جَمِيْعَة [u'jamiyoEap_1']
 جَمَّدَ [u'jam~ada_2', u'jam~ada_1']
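Because lookups only work with the vocalized spelling, a listing like the one above can be turned into an index keyed by the unvocalized form. The following is a minimal sketch, assuming the (word, ids) pairs have the shape shown in the output above. `strip_diacritics` and `build_unvocalized_index` are hypothetical helpers, not part of AWNDatabaseManagement.py, and the diacritic range they remove (U+064B to U+0652 plus U+0670) is an assumption about what the database uses.

```python
import re

# Tanween, short vowels, shadda, sukun (U+064B..U+0652) and dagger alif (U+0670).
_DIACRITICS = re.compile(u"[\u064b-\u0652\u0670]")


def strip_diacritics(word):
    """Return the word with Arabic vocalization marks removed."""
    return _DIACRITICS.sub(u"", word)


def build_unvocalized_index(pairs):
    """Map each unvocalized spelling to its list of (vocalized word, ids)."""
    index = {}
    for word, ids in pairs:
        index.setdefault(strip_diacritics(word), []).append((word, list(ids)))
    return index


# Three pairs copied from the listing above, written with escapes so the
# snippet stays ASCII-only (Buckwalter forms are given in the comments).
sample = [
    (u"\u062c\u064e\u0645\u0650\u064a\u0639\u064e\u0629", [u"jamiyEap_1"]),         # jamiyEap
    (u"\u062c\u064e\u0645\u0650\u064a\u0644", [u"jamiyl_1"]),                       # jamiyl
    (u"\u062c\u064e\u0645\u0650\u064a\u0652\u0639\u064e\u0629", [u"jamiyoEap_1"]),  # jamiyoEap
]
index = build_unvocalized_index(sample)
```

With the real database loaded, the same index could presumably be built from `wn.get_words(False)`. An unvocalized query may then map to several vocalized entries, as the two spellings of jamiyEap in the sample do.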

Choose your word and pick one of its IDs. IDs are written in Buckwalter romanization. A word with several IDs has several meanings. Describe the chosen word with:

>> wn._words["jamiyl_1"].describe()
wordid  jamiyl_1
value  جَمِيل
synsets  [u'jamiyl_a1AR']
forms  [(u'root', u'\u062c\u0645\u0644')]

Now you have the list of synsets. For more information about a synset, use:

>> wn._items["jamiyl_a1AR"].describe()
itemid  jamiyl_a1AR
offset  300218842
name  جَمِيل
type  synset
pos  a
input links  [[u'be_in_state', u'jamaAl_n1AR'], [u'near_antonym', u'qabiyH_a1AR']]
output links  [[u'near_antonym', u'qabiyH_a1AR']]
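Since the word and synset IDs are written in Buckwalter romanization, they can be converted back to Arabic script with a lookup table. The sketch below assumes the IDs follow the standard Buckwalter table and that everything after the last underscore is a sense suffix, as in the examples above. `decode_word_id` is a hypothetical helper, and characters missing from the table are passed through unchanged.

```python
# Standard Buckwalter transliteration table (ASCII symbol -> Arabic character).
BUCKWALTER = {
    u"'": u"\u0621", u"|": u"\u0622", u">": u"\u0623", u"&": u"\u0624",
    u"<": u"\u0625", u"}": u"\u0626", u"A": u"\u0627", u"b": u"\u0628",
    u"p": u"\u0629", u"t": u"\u062a", u"v": u"\u062b", u"j": u"\u062c",
    u"H": u"\u062d", u"x": u"\u062e", u"d": u"\u062f", u"*": u"\u0630",
    u"r": u"\u0631", u"z": u"\u0632", u"s": u"\u0633", u"$": u"\u0634",
    u"S": u"\u0635", u"D": u"\u0636", u"T": u"\u0637", u"Z": u"\u0638",
    u"E": u"\u0639", u"g": u"\u063a", u"_": u"\u0640", u"f": u"\u0641",
    u"q": u"\u0642", u"k": u"\u0643", u"l": u"\u0644", u"m": u"\u0645",
    u"n": u"\u0646", u"h": u"\u0647", u"w": u"\u0648", u"Y": u"\u0649",
    u"y": u"\u064a", u"F": u"\u064b", u"N": u"\u064c", u"K": u"\u064d",
    u"a": u"\u064e", u"u": u"\u064f", u"i": u"\u0650", u"~": u"\u0651",
    u"o": u"\u0652", u"`": u"\u0670", u"{": u"\u0671",
}


def decode_word_id(word_id):
    """Split an ID such as 'jamiyl_1' into (Arabic spelling, suffix)."""
    if u"_" in word_id:
        # Assumption: the part after the last underscore is the sense suffix.
        stem, suffix = word_id.rsplit(u"_", 1)
    else:
        stem, suffix = word_id, u""
    arabic = u"".join(BUCKWALTER.get(ch, ch) for ch in stem)
    return arabic, suffix


arabic, suffix = decode_word_id(u"jamiyl_1")
```

The same function also handles synset IDs such as jamiyl_a1AR, where the suffix comes back as a1AR.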