Add words to a local copy of WordNet

三世轮回 posted on 2021-02-07 03:26:49

Question


I am using WordNet, accessed through Python's NLTK, to compare the synsets of words from social media. Many of those words aren't in the version of WordNet that NLTK connects to.

By words I mean domain-specific terms, not abbreviations or emoticons.

I've compiled a list of these words and would like to merge that list with WordNet.

Searching for prior efforts turns up only attempts to develop methods of automatically updating WordNet.

The steps I imagine are:

  1. Clone the WordNet db
  2. Write an extension of the WordNet module that looks for a local copy
  3. Update that local copy.
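
Steps 1 and 2 might be sketched roughly as follows. This is only a sketch under assumptions: the helper names `clone_wordnet` and `load_local_wordnet` are hypothetical, and it assumes the WordNet corpus has already been downloaded and unzipped (for example via `nltk.download('wordnet')`). `WordNetCorpusReader` is NLTK's own reader class; passing `None` as its second argument leaves out the multilingual data.

```python
import shutil
from pathlib import Path


def clone_wordnet(src_dir, dest_dir):
    """Step 1: copy the WordNet data files to dest_dir unless a copy exists."""
    src, dest = Path(src_dir), Path(dest_dir)
    if not dest.exists():
        shutil.copytree(src, dest)
    return dest


def load_local_wordnet(local_dir):
    """Step 2: build an NLTK reader over the local copy (requires nltk)."""
    # Imported lazily so that cloning works even where NLTK is not installed.
    from nltk.corpus.reader.wordnet import WordNetCorpusReader

    return WordNetCorpusReader(str(local_dir), None)
```

With NLTK installed, usage would be along the lines of `wn = load_local_wordnet(clone_wordnet(src, "wordnet_local"))`, where `src` is the corpus directory that `nltk.data.find("corpora/wordnet")` locates and `wordnet_local` is a placeholder path. Step 3 is the harder part: WordNet's `data.*` and `index.*` files refer to synsets by byte offset, so adding entries by hand means recomputing those offsets.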

How reasonable does this sound?


Answer 1:


I haven't changed WordNet myself yet, but I have had good experiences working with the Multilingual Central Repository, and I believe you should be able to do what you want using that.

It contains the data files for WordNet 3.0 in several languages, including English, which have been tied to each other through so-called Inter-Lingual Indexes (ILI). The data files can be loaded into MySQL or PostgreSQL database tables, after which it should be relatively easy not just to query them with SQL commands but also to insert new items while maintaining correspondence between tables. You can of course export the changed database as well, e.g. into CSV files, if using SQL is not enough for your purposes.
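
As a rough illustration of inserting a new item while keeping related tables in sync, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for MySQL or PostgreSQL. The two-table schema (`synset`, `variant`), the ID `custom-00000001-n`, and the example word are hypothetical simplifications, not the repository's actual layout.

```python
import sqlite3


def create_schema(conn):
    """Create a simplified, hypothetical two-table lexicon schema."""
    conn.executescript("""
        CREATE TABLE synset (
            synset_id TEXT PRIMARY KEY,
            pos       TEXT NOT NULL,
            gloss     TEXT
        );
        CREATE TABLE variant (
            word      TEXT NOT NULL,
            synset_id TEXT NOT NULL REFERENCES synset(synset_id),
            PRIMARY KEY (word, synset_id)
        );
    """)


def add_word(conn, word, synset_id, pos, gloss):
    """Insert a word and, if missing, its synset in a single transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute(
            "INSERT OR IGNORE INTO synset (synset_id, pos, gloss) VALUES (?, ?, ?)",
            (synset_id, pos, gloss),
        )
        conn.execute(
            "INSERT OR IGNORE INTO variant (word, synset_id) VALUES (?, ?)",
            (word, synset_id),
        )


def synsets_for(conn, word):
    """Return the synset IDs that a word belongs to."""
    rows = conn.execute(
        "SELECT synset_id FROM variant WHERE word = ? ORDER BY synset_id",
        (word,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the variant -> synset link
create_schema(conn)
add_word(conn, "subtweet", "custom-00000001-n", "n",
         "a post that refers to someone without naming them")
print(synsets_for(conn, "subtweet"))
```

Wrapping both inserts in one transaction is what keeps the word table and the synset table consistent: either both rows land or neither does.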



Source: https://stackoverflow.com/questions/20749730/add-words-to-a-local-copy-of-wordnet
