wordnet

Wordnet (Word Sense Annotated) Corpus

爱⌒轻易说出口 submitted on 2019-12-09 13:26:28
Question: I've been utilizing lots of different corpora for natural language processing, and I've been looking for a corpus that has been annotated with WordNet word senses. I understand that there probably is not a big corpus with this information, since such a corpus needs to be built up manually, but there has to be something to go off of. Also, if there isn't such a corpus in existence, is there at least a sense-annotated n-gram database (with what percentage of the time a word is each of its definitions, or…
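A hedged pointer, not part of the original question: NLTK ships SemCor, a WordNet sense-tagged subset of the Brown Corpus, which is the closest thing to a freely available sense-annotated corpus. A minimal sketch of reading its sense tags, assuming the semcor and wordnet corpora have been downloaded via nltk.download:

    from nltk.corpus import semcor

    # Sense-tagged chunks come back as Trees whose label is a WordNet Lemma;
    # untagged tokens are plain lists of strings, so filter on the label attribute.
    for sentence in semcor.tagged_sents(tag='sem')[:2]:
        for chunk in sentence:
            if hasattr(chunk, 'label'):
                print(chunk.label(), chunk.leaves())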

Finding the synonyms for words in wordnet

萝らか妹 submitted on 2019-12-08 12:58:11
Question: I am trying to use WordNet as a thesaurus, so I have a list of words and I need to collect the synonyms for every word. I tried this:

    from nltk.corpus import wordnet as wn
    for i, j in enumerate(wn.synsets('dog')):
        print (j.lemma_names)

This code gives the following output:

    <bound method Synset.lemma_names of Synset('dog.n.01')>
    <bound method Synset.lemma_names of Synset('frump.n.01')>
    <bound method Synset.lemma_names of Synset('dog.n.03')>
    <bound method Synset.lemma_names of Synset('cad.n.01')>
    <bound…
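A likely fix, added here as a hedged note rather than the thread's accepted answer: in NLTK 3.x lemma_names is a method, so it has to be called to get the synonym strings.

    from nltk.corpus import wordnet as wn

    for synset in wn.synsets('dog'):
        # calling lemma_names() returns the list of lemma strings for this synset
        print(synset.lemma_names())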

R error when lemmatizing a corpus of documents with wordnet

烈酒焚心 submitted on 2019-12-08 12:52:48
Question: I'm trying to lemmatize a corpus of documents in R with the wordnet library. This is the code:

    corpus.documents <- Corpus(VectorSource(vector.documents))
    corpus.documents <- tm_map(corpus.documents, removePunctuation)
    library(wordnet)
    lapply(corpus.documents, function(x){
      x.filter <- getTermFilter("ContainsFilter", x, TRUE)
      terms <- getIndexTerms("NOUN", 1, x.filter)
      sapply(terms, getLemma)
    })

But when running this I get the error:

    Errore in .jnew(paste("com.nexagis.jawbone.filter", type, sep…

Getting all nouns related to a verb in WordNet using JWNL

心不动则不痛 submitted on 2019-12-08 12:02:09
Question: I am using JWNL (1.4.1 rc2). Given a verb, I need to find "related" nouns. For example, given the verb bear I want the noun birth. I can see this through the WordNet online interface: http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=bear&i=8&h=000100000000000000000#c. How would this be done in JWNL?

Answer 1: You can use the Synset for each sense of the word, then print out the word in each Synset, as follows:

    IndexWord indexWord = proc.lookupBaseForm(POS…
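For comparison only (not part of the original JWNL answer), the same derivational lookup can be sketched with NLTK's Python WordNet interface, where derivationally_related_forms() links verb lemmas such as bear to noun lemmas such as birth:

    from nltk.corpus import wordnet as wn

    for lemma in wn.lemmas('bear', pos=wn.VERB):
        for related in lemma.derivationally_related_forms():
            # keep only derivationally related forms that live in noun synsets
            if related.synset().pos() == 'n':
                print(lemma.name(), '->', related.name())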

Extracting synonymous terms from wordnet using synonyms()

℡╲_俬逩灬. submitted on 2019-12-08 07:42:34
Question: Suppose I am pulling the synonyms of "help" with the synonyms() function from wordnet and get the following:

    Str = synonyms("help")
    Str
    [1] "c(\"aid\", \"assist\", \"assistance\", \"help\")"
    [2] "c(\"aid\", \"assistance\", \"help\")"
    [3] "c(\"assistant\", \"helper\", \"help\", \"supporter\")"
    [4] "c(\"avail\", \"help\", \"service\")"

Then I can get a single character vector using unique(unlist(lapply(parse(text=Str), eval))) at the end, which looks like this:

    [1] "aid" "assist" "assistance"…

Python 3 and NLTK with WordNet 2.1 - is that possible?

↘锁芯ラ submitted on 2019-12-08 02:41:30
Question: I use Python 3 and NLTK 3.0.0 with WordNet 3.0. I would like to use this data (semeval2007) with WordNet 2.1. Is it possible to use WordNet 2.1 with Python 3? Is it possible to replace WordNet 3.0 with WordNet 2.1? How can I do that?

Answer 1: You can use WordNetCorpusReader to load a specific version of WordNet:

    import nltk
    from nltk.corpus import WordNetCorpusReader
    wn2 = WordNetCorpusReader("WordNet-2.0/dict", nltk.data.find("WordNet-2.0/dict"))
    print(wn2.get_version())

Source: https://stackoverflow.com
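A hedged usage note, assuming an NLTK version that accepts this two-argument constructor and that the WordNet 2.x files are unpacked at the path shown: the object returned is a full corpus reader, so the usual wordnet API is available on it.

    # lookups now hit the older database instead of the bundled WordNet 3.0
    print(wn2.synsets('dog')[:3])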

nltk “OMW” wordnet with Arabic language

狂风中的少年 submitted on 2019-12-07 23:03:35
Question: I'm working with Python/NLTK and the Open Multilingual Wordnet (OMW), specifically for the Arabic language. All the functions work fine with English, yet I can't seem to perform any of them when I use the 'arb' tag. The only thing that works well is extracting the lemma_names from a given Arabic synset. The code below works fine with u'arb'; the output is a list of Arabic lemmas:

    for synset in wn.synsets(u'عام', lang=('arb')):
        for lemma in synset.lemma_names(u'arb'):
            print lemma

When I try to…
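A hedged illustration, not taken from the original thread: the synsets returned for an Arabic lemma are ordinary English-keyed WordNet synsets, so synset-level operations (hypernyms, similarity, and so on) still work; only lemma-level data takes a lang argument.

    from nltk.corpus import wordnet as wn

    for synset in wn.synsets('عام', lang='arb'):
        # lemma_names('arb') is per-language; hypernyms() walks the shared synset graph
        print(synset.name(), synset.lemma_names('arb'))
        print([h.name() for h in synset.hypernyms()])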

WordNetLemmatizer not returning the right lemma unless POS is explicit - Python NLTK

喜你入骨 submitted on 2019-12-07 17:22:07
Question: I'm lemmatizing the TED Dataset transcripts. I notice something strange: not all words are being lemmatized. For example, selected -> select, which is right. However, involved !-> involve and horsing !-> horse unless I explicitly pass the 'v' (verb) attribute. On the Python terminal I get the right output, but not in my code:

    >>> from nltk.stem import WordNetLemmatizer
    >>> from nltk.corpus import wordnet
    >>> lem = WordNetLemmatizer()
    >>> lem.lemmatize('involved','v')
    u'involve'
    >>> lem…
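The usual explanation, added here as a hedged note: WordNetLemmatizer.lemmatize() defaults to pos='n', so verb forms like involved are left untouched unless a POS is supplied. A minimal sketch that derives the POS from nltk.pos_tag (the sample sentence is invented for illustration):

    from nltk import pos_tag, word_tokenize
    from nltk.corpus import wordnet
    from nltk.stem import WordNetLemmatizer

    def wordnet_pos(treebank_tag):
        # map Penn Treebank tags onto WordNet POS constants; fall back to noun,
        # which is also the lemmatizer's default
        if treebank_tag.startswith('J'):
            return wordnet.ADJ
        if treebank_tag.startswith('V'):
            return wordnet.VERB
        if treebank_tag.startswith('R'):
            return wordnet.ADV
        return wordnet.NOUN

    lem = WordNetLemmatizer()
    for word, tag in pos_tag(word_tokenize("They were involved in horsing around")):
        print(word, lem.lemmatize(word, wordnet_pos(tag)))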

How to use the Spanish Wordnet in NLTK?

吃可爱长大的小学妹 submitted on 2019-12-07 15:55:18
Question: I just downloaded a Spanish Wordnet from the GRIAL project; the format is XML. How can I use it in Python NLTK? Besides that, on the same page you can download a tagged corpus in Spanish. How can I incorporate that as well?

Answer 1: Use XMLCorpusReader to load the XML data as a corpus. Here's the code to do that:

    from nltk.corpus.reader import XMLCorpusReader
    reader = XMLCorpusReader(dir, file)

A fully working example which uses XMLCorpusReader is given here.

Source: https://stackoverflow.com/questions/25615741
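A hedged usage note (the path and file pattern below are placeholders, not from the original answer): the reader's constructor takes a root directory and a fileid pattern, and it exposes both the parsed XML and a token view.

    from nltk.corpus.reader import XMLCorpusReader

    # hypothetical location of the unpacked GRIAL XML files
    reader = XMLCorpusReader('/path/to/spanish-wordnet', r'.*\.xml')
    print(reader.fileids())
    tree = reader.xml(reader.fileids()[0])        # parsed ElementTree element
    print(reader.words(reader.fileids()[0])[:20]) # plain token view of the same file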

Python: Passing variables into Wordnet Synsets methods in NLTK

我们两清 submitted on 2019-12-07 12:20:24
Question: I need to work on a project that requires NLTK, so I started learning Python two weeks ago, but I'm struggling to understand Python and NLTK. From the NLTK documentation I can understand the following code, and it works well if I manually put the words apple and pear into it:

    from nltk.corpus import wordnet as wn
    apple = wn.synset('apple.n.01')
    pear = wn.synset('pear.n.01')
    print apple.lch_similarity(pear)

    Output: 2.53897387106

However, I need to use NLTK to work with a list of…
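A hedged sketch of the variable-driven version the question is heading toward (the word list is invented for illustration): instead of hard-coding synset names, take the first noun sense returned by wn.synsets for each word.

    from nltk.corpus import wordnet as wn

    words = ['apple', 'pear', 'orange']                        # hypothetical input list
    synsets = [wn.synsets(w, pos=wn.NOUN)[0] for w in words]   # first noun sense of each word

    # lch_similarity is defined between synsets of the same POS, noun vs. noun here
    for a in synsets:
        for b in synsets:
            print(a.name(), b.name(), a.lch_similarity(b))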