wordnet

NLTK Wordnet Synset for word phrase

别来无恙 submitted on 2019-12-05 08:32:35
I'm working with the Python NLTK WordNet API. I'm trying to find the best synset to represent a group of words. If I need to find the best synset for something like "school & office supplies", I'm not sure how to go about it. So far I've tried finding the synsets for the individual words and then computing the best lowest common hypernym, like this:

    def find_best_synset(category_name):
        text = word_tokenize(category_name)
        tags = pos_tag(text)
        node_synsets = []
        for word, tag in tags:
            pos = get_wordnet_pos(tag)
            if not pos:
                continue
            node_synsets.append(wordnet.synsets(word, pos=pos))
        max_score
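The lowest-common-hypernym step described in this question can be illustrated without WordNet data. The following is a minimal sketch, assuming a tiny hand-made taxonomy in which every node has a single parent; `TOY_HYPERNYMS` and all of its entries are hypothetical, not real WordNet content. On real synsets, NLTK's `Synset.lowest_common_hypernyms()` plays this role.

```python
# Hypothetical toy taxonomy (child -> parent). Not real WordNet data.
TOY_HYPERNYMS = {
    "pencil": "writing_implement",
    "pen": "writing_implement",
    "writing_implement": "implement",
    "stapler": "implement",
    "implement": "artifact",
    "desk": "furniture",
    "furniture": "artifact",
    "artifact": "entity",
}


def hypernym_path(node):
    """Return [node, parent, grandparent, ..., root]."""
    path = [node]
    while path[-1] in TOY_HYPERNYMS:
        path.append(TOY_HYPERNYMS[path[-1]])
    return path


def lowest_common_hypernym(nodes):
    """Return the most specific node that is an ancestor of (or equal to) every input."""
    if not nodes:
        return None
    paths = [hypernym_path(n) for n in nodes]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking one path from leaf to root, the first shared node is the deepest one.
    for candidate in paths[0]:
        if candidate in shared:
            return candidate
    return None


print(lowest_common_hypernym(["pencil", "stapler"]))
print(lowest_common_hypernym(["pencil", "pen", "desk"]))
```

Real WordNet synsets can have several hypernyms, so a single-parent walk like this is only an approximation of what the library does.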

Wordnet Lemmatizer for R

与世无争的帅哥 submitted on 2019-12-05 06:57:42
Question: I would like to use the WordNet lemmatizer to lemmatize the words in a:

    > a<-c("He saw a see-saw on a sea shore", "she is feeling cold")
    > a
    [1] "He saw a see-saw on a sea shore" "she is feeling cold"

I convert a into a corpus and do pre-processing steps (such as stopword removal and lemmatization):

    > a <- Corpus(VectorSource(a))

I wanted to do the lemmatization in the following way:

    > filter <- getTermFilter("ExactMatchFilter", a, TRUE)
    > terms <- getIndexTerms("NOUN", 1, filter)
    > sapply(terms,

General synonym and part of speech processing using nltk

血红的双手。 submitted on 2019-12-05 04:40:28
I'm trying to create a general synonym identifier for the significant words in a sentence (i.e. not "a" or "the"), and I am using the Natural Language Toolkit (NLTK) in Python for it. The problem is that the synonym finder in NLTK requires a part-of-speech argument in order to link a word to its synonyms. My attempted fix was to use the simplified part-of-speech tagger in NLTK and then reduce each tag to its first letter in order to pass it to the synonym finder; however, this is not working.

    def synonyms(Sentence):
        Keywords = []
        Equivalence =
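One possible reason a first-letter shortcut fails is that tagger tags and WordNet's part-of-speech codes do not always line up letter for letter: WordNet uses `n`, `v`, `a` and `r`, while a Penn Treebank adjective tag such as `JJ` lowercases to `j`. Below is a minimal sketch of an explicit mapping, assuming Penn Treebank tags like those returned by `nltk.pos_tag`; the helper name `to_wordnet_pos` and the sample tagged sentence are hypothetical.

```python
# WordNet part-of-speech codes as single letters
# (in NLTK these equal wordnet.NOUN, wordnet.VERB, wordnet.ADJ, wordnet.ADV).
NOUN, VERB, ADJ, ADV = "n", "v", "a", "r"


def to_wordnet_pos(penn_tag):
    """Map a Penn Treebank tag to a WordNet POS code, or None if there is no match."""
    if penn_tag.startswith("NN"):
        return NOUN
    if penn_tag.startswith("VB"):
        return VERB
    if penn_tag.startswith("JJ"):
        return ADJ
    if penn_tag.startswith("RB"):
        return ADV
    return None  # determiners, prepositions, pronouns and other closed-class tags


# Hand-written (word, tag) pairs standing in for tagger output.
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("runs", "VBZ"), ("quickly", "RB")]
keywords = [(word, to_wordnet_pos(tag)) for word, tag in tagged if to_wordnet_pos(tag)]
print(keywords)
```

The returned code can then be passed as the `pos` argument of `wordnet.synsets(word, pos=...)`, and words that map to `None` can be skipped as non-significant.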

word disambiguation algorithm (Lesk algorithm)

有些话、适合烂在心里 submitted on 2019-12-05 03:46:35
Question: Hi. Can anybody help me find an algorithm, in Java, that finds synonyms of a search word based on its context? I want to implement the algorithm with the WordNet database. For example, take "I am running a Java program". I want to find synonyms for the word "running", but they must suit the context.

Answer 1: Let me illustrate a possible approach:
Let your sentence be A B C.
Let each word have synsets, i.e. {A:(a1, a2, a3), B:(b1), C:(c1, c2)}.
Now form
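For reference, the Lesk algorithm named in the title picks the sense whose dictionary gloss shares the most words with the surrounding context. The following is a minimal sketch of the simplified variant, written in Python rather than the Java the question asks for, so treat it as an outline to port. It is not the approach the answer above begins to describe. The sense labels, glosses and stopword list are made-up illustrations, not WordNet entries; a real implementation would read glosses from WordNet.

```python
# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = frozenset({"a", "an", "the", "i", "am", "is", "on", "or", "by", "of", "to"})


def simplified_lesk(context, sense_glosses):
    """Return the sense label whose gloss overlaps most with the context words."""
    context_words = {w.lower() for w in context.split()} - STOPWORDS
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        gloss_words = {w.lower() for w in gloss.split()} - STOPWORDS
        overlap = len(context_words & gloss_words)
        if overlap > best_overlap:  # ties keep the earlier sense
            best_sense, best_overlap = sense, overlap
    return best_sense


# Made-up glosses for two senses of "run"; not taken from WordNet.
RUN_SENSES = {
    "run.move": "move fast by using the feet",
    "run.execute": "carry out a process or program on a computer",
}

print(simplified_lesk("I am running a Java program", RUN_SENSES))
```

Once a sense is chosen, its synonyms are the other lemmas in that sense's synset. NLTK ships a comparable function, `nltk.wsd.lesk`, for Python users.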

How to implement category based text tagging using WordNet or related to wordnet?

家住魔仙堡 submitted on 2019-12-04 22:33:36
Question: How can I tag text by each word's category using WordNet (with Java as the interface)?

Example. Consider the sentences:
1) Computers need keyboard, monitor, CPU to work.
2) Automobile uses gears and clutch.

Now my objective is for the example sentences to be tagged as follows:
1st sentence: Computer/electronic keyboard/electronic CPU/electronic
2nd sentence: Automobile/mechanical gears/mechanical clutch/mechanical

Some extra examples ... "Clutch and gear is monitored using microchip" -> clutch

Compiling WordNet 3.0 on OSX 10.8.5

北城以北 submitted on 2019-12-04 18:14:59
I'm trying to get WordNet to work on my notebook, but it fails in the make step of the process as follows:

    WaldundWiesenComputer:WordNet-3.0 Gnaddel$ make
    make all-recursive
    Making all in doc
    Making all in html
    make[3]: Nothing to be done for `all'.
    Making all in man
    make[3]: Nothing to be done for `all'.
    Making all in pdf
    make[3]: Nothing to be done for `all'.
    Making all in ps
    make[3]: Nothing to be done for `all'.
    make[3]: Nothing to be done for `all-am'.
    Making all in dict
    make[2]: Nothing to be done for `all'.
    Making all in include
    Making all in tk
    make[3]: Nothing to be done for `all'.

How do I include pronouns and other types of words in Wordnet?

只愿长相守 submitted on 2019-12-04 18:04:24
I am using Princeton's WordNet for an application, but the database has no support for pronouns, conjunctions, and several other types of words. Does anybody know of a way to supplement the WordNet database with these types of words? Thanks, Ted

First, they're missing because WordNet only contains open-class words, as described on their page:
Q. Why is WordNet missing: of, an, the, and, about, above, because, etc.
A. WordNet only contains "open-class words": nouns, verbs, adjectives, and adverbs. Thus, excluded words include determiners, prepositions, pronouns,

Multi Threading in NLTK WordNetLemmatizer?

自作多情 submitted on 2019-12-04 15:51:11
I am trying to use multithreading to speed up processing. I use the WordNetLemmatizer to lemmatize the words, which SentiWordNet can then use to calculate the sentiment of the text. My sentiment analysis function, which uses the WordNetLemmatizer, is as follows:

    import nltk
    from nltk.corpus import sentiwordnet as swn

    def SentimentA(doc, file_path):
        sentences = nltk.sent_tokenize(doc)
        # print(sentences)
        stokens = [nltk.word_tokenize(sent) for sent in sentences]
        taggedlist = []
        for stoken in stokens:
            taggedlist.append(nltk.pos_tag(stoken))
        wnl = nltk
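A common pattern for per-document work of this kind is a worker pool from `concurrent.futures`. The sketch below is only an outline under stated assumptions: `analyze` is a hypothetical stand-in for a function like `SentimentA`, and it merely lowercases tokens so that the example runs without NLTK data. Two points are worth checking for a real setup. CPython threads rarely speed up CPU-bound work because of the global interpreter lock, so a process pool may be the better fit. It is also commonly suggested to load NLTK's lazily loaded corpora once in the main thread before starting workers, for example with `wordnet.ensure_loaded()`.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(doc):
    """Hypothetical stand-in for per-document work such as lemmatizing and scoring."""
    tokens = doc.split()
    return [token.lower() for token in tokens]


def analyze_all(docs, max_workers=4):
    """Run analyze() over docs in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze, docs))


if __name__ == "__main__":
    docs = ["He saw a see-saw", "She is feeling cold"]
    print(analyze_all(docs))
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` keeps the same interface; in that case the `if __name__ == "__main__":` guard matters on platforms that spawn worker processes.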

Getting word stems with JWI and Wordnet

大城市里の小女人 submitted on 2019-12-04 11:32:11
How do I correctly use the stemmer implemented in MIT's JWI (Java API for WordNet) to get the stem of a word? I'm not sure how to initialize a stemmer and use the findStems method.

You don't need an additional library, but you do need a dictionary. You can download one from Princeton: https://wordnet.princeton.edu/wordnet/download/current-version/
I recommend downloading only the dictionary from the section "WordNet 3.1 DATABASE FILES ONLY". Extract the archive. Supposing that PATH/dict is the location of the output, you can use this code:

    Dictionary dict = new Dictionary(new

Trying to find synonyms using wordnet java api

让人想犯罪 __ submitted on 2019-12-04 10:03:25
I am trying to find synonyms of some words (of type String) in Java using a WordNet Java API, but I am having difficulty figuring out how it works. I found this link, http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/impl/file/ReferenceSynset.html#getTagCount%28java.lang.String%29, which I thought was useful, but I still don't know how to start. Do I have to create a ReferenceSynset object and then find its synonyms, and how can this be done? Or is there an easier way? Please help! Thanks in advance!

JAWS, the "Java API for WordNet Searching", was created for exactly this purpose.