wordnet

How to automatically label a cluster of words using semantics?

ⅰ亾dé卋堺 submitted on 2019-12-12 07:49:36
Question: The context is: I already have clusters of words (phrases, actually) produced by k-means applied to internet search queries, using the URLs that the search engine results have in common as a distance (co-occurrence of URLs rather than of words, to simplify a lot). I would like to label the clusters automatically using semantics; in other words, I'd like to extract the main concept surrounding a group of phrases considered together. For example - sorry for the subject of my example - if I have the

Iterate one list of synsets over another

别等时光非礼了梦想. submitted on 2019-12-12 04:53:24
Question: I have two sets of WordNet synsets (held in two separate list objects, s1 and s2). For each synset in s1, I want to find the maximum path similarity score against s2, so that the output has the same length as s1. For example, if s1 contains 4 synsets, the output should have length 4. So far I have experimented with the following code:
import numpy as np
import nltk
from nltk.corpus import wordnet as wn
import pandas as pd
#two wordnet synsets (s1, s2)
s1 = [wn.synset('be.v.01
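
A minimal sketch of the computation described in the question, not the asker's own (truncated) code. It assumes only that each element of s1 and s2 offers a path_similarity(other) method, as NLTK synsets do; the function name max_path_similarities is hypothetical. Pairs for which path_similarity returns None are skipped, so a synset with no comparable partner in s2 yields None rather than raising an error.

```python
def max_path_similarities(s1, s2):
    """For each synset in s1, return its highest path similarity to any synset in s2."""
    best_scores = []
    for a in s1:
        # path_similarity can return None when no path connects two synsets,
        # so None values are filtered out before taking the maximum.
        scores = [a.path_similarity(b) for b in s2]
        scores = [score for score in scores if score is not None]
        # Keep one entry per synset in s1 so the output length matches len(s1).
        best_scores.append(max(scores) if scores else None)
    return best_scores
```

With NLTK and its WordNet corpus installed, this could be called as max_path_similarities(s1, s2) on two lists built with wn.synset(...); the result always has the same length as s1.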

Converting list of strings with u'…' to a list of normal strings [duplicate]

僤鯓⒐⒋嵵緔 submitted on 2019-12-12 03:15:31
Question: This question already has answers here: What's the u prefix in a Python string? (6 answers) Closed 3 years ago. I'm new to Python, so apologies for a very basic question. I'm working with the Python pattern.en library and trying to get the synonyms of a word. This is my code, and it works fine:
from pattern.en import wordnet
a=wordnet.synsets('human')
print a[0].synonyms
This is the output I get:
[u'homo', u'man', u'human being', u'human']
But for my program I need to insert
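
A short illustration of what the title asks for, under the assumption that the synonyms are plain ASCII words like the ones shown. In Python 2 the u prefix only marks a unicode string and appears when the list itself is printed; it is not part of the text. The sample list below is copied from the output in the question rather than fetched from pattern.en.

```python
# Sample data copied from the question's output (not fetched from pattern.en here).
words = [u'homo', u'man', u'human being', u'human']

# str() gives plain strings. In Python 3 every str is already Unicode, so this
# is a no-op; in Python 2 it works for ASCII-only text such as these words.
plain = [str(w) for w in words]

# Printing or joining the items (rather than the list) never shows the u prefix.
joined = ", ".join(plain)
print(joined)
```

For non-ASCII text under Python 2, str() would raise an encoding error, and w.encode('utf-8') is the usual alternative.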

Extract Word from Synset using Wordnet in NLTK 3.0

喜夏-厌秋 submitted on 2019-12-12 02:59:48
Question: Some time ago, someone on SO asked how to retrieve a list of words for a given synset using NLTK's WordNet wrapper. Here is one of the suggested responses:
for synset in wn.synsets('dog'):
    print synset.lemmas[0].name
Running this code with NLTK 3.0 yields TypeError: 'instancemethod' object is not subscriptable. I tried each of the previously proposed solutions (each of the solutions described on the page linked above), but each throws an error. I therefore wanted to ask: Is it possible to
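
The TypeError above is consistent with lemmas and name being methods in NLTK 3 rather than attributes, so both need parentheses. A minimal sketch under that assumption follows; the helper names lemma_names and demo are hypothetical, and demo is defined but not called because it needs nltk and the WordNet corpus to be installed.

```python
def lemma_names(synsets):
    """Return the lemma names of each synset, calling lemmas() and name() as methods."""
    return [[lemma.name() for lemma in synset.lemmas()] for synset in synsets]


def demo():
    # Requires nltk and its WordNet corpus (for example via nltk.download('wordnet')).
    from nltk.corpus import wordnet as wn
    for names in lemma_names(wn.synsets('dog')):
        print(names)
```

NLTK 3 synsets also provide a lemma_names() method that returns the same names directly.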

How to determine semantic hierarchies / relations in using NLTK?

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-12 01:35:24
Question: I want to use NLTK and WordNet to understand the semantic relation between two words. For example, if I enter "employee" and "waiter", it should return something showing that employee is more general than waiter; for "employee" and "worker", it should return equal. Does anyone know how to do that?
Answer 1: First, you have to tackle the problem of getting from words to lemmas and then to synsets, i.e. how can you identify a synset from a word?
word => lemma => lemma.pos.sense => synset
Waiters => waiter =>
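
Once two synsets have been chosen, one way to test "more general than" is to walk the hypernym links. The sketch below is an assumption about how the comparison could be done, not the answer's own method: it relies only on each synset offering a hypernyms() method and being hashable, as NLTK synsets are. The function names and the returned labels are hypothetical.

```python
def all_hypernyms(synset):
    """Collect every ancestor reachable from synset through hypernyms()."""
    seen = set()
    stack = list(synset.hypernyms())
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(current.hypernyms())
    return seen


def relation(a, b):
    """Describe how synset a relates to synset b in the hypernym hierarchy."""
    if a == b:
        return "equal"
    if a in all_hypernyms(b):
        return "first is more general"
    if b in all_hypernyms(a):
        return "second is more general"
    return "unrelated"
```

This only compares two specific synsets; picking the right sense for each word, the step the answer starts with, still has to happen first. Named instances in WordNet are linked through instance_hypernyms() instead, which this sketch does not follow.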

JAWS import issue

谁说胖子不能爱 submitted on 2019-12-12 01:31:18
Question: I am using JAWS (Java API for WordNet Searching) and have set up the WordNet dictionary with VM arguments in Eclipse. I downloaded the jaws-bin.jar file and placed it in my project directory, C:\Users\My-pc\Projects\MyApp\src. After this, I ran the code successfully using:
java -classpath .;C:\Users\My-pc\Projects\MyApp\bin\jaws-bin.jar -Dwordnet.database.dir=C:\WordNet-3.0\dict MyAppName
Now when I import the package "edu.smu.tspell.wordnet.*", I get the error "The import edu cannot be

JAWS wordnet similarity

时光毁灭记忆、已成空白 submitted on 2019-12-12 00:29:25
Question: I use JAWS with standard WordNet to find the similarity between words. I installed WordNet 2.1, added the jar files edu.mit.jwi_2.1.4.jar and edu.sussex.nlp.jws.beta.11.jar, and copied WordNet-2.1-InfoContent into D:\Program Files\WordNet\2.1, but I get this problem when I run my application:
Loading modules set up: ... finding noun and verb <roots> ... calculating IC <roots> ... ... ICFinder java.io.FileNotFoundException: D:\Program Files\WordNet\2.1\WordNet-InfoContent-2.1 (Access

extracting synonyms using wordnet

让人想犯罪 __ submitted on 2019-12-11 17:05:40
Question: I am currently working on my thesis and implementing the solution in R. I have to find synonyms using the WordNet dictionary library. I can get the synonyms for a single word, but when I try to get synonyms for a set of words in a loop, I get the error "Subscription is out of bond". Could someone show me how to get synonyms for each word in a text using a loop, or is there another way to do it? Here is the code I am trying:
*my_corpus <- "closure animal wrong carnivore

integrate wordnet with solr

放肆的年华 submitted on 2019-12-11 12:28:34
Question: I am trying to integrate the WordNet API into Apache Solr, but it does not seem to be working, and there is no good documentation either. Could anybody with experience of this please post the steps?
Answer 1: There is more than one way to do this:
1) https://issues.apache.org/jira/browse/LUCENE-2347
2) https://gist.github.com/562776
These are simple Java classes that extract the synonyms from WordNet's Prolog file, in more or less the same way. Hope this helps. Péter
Source: https://stackoverflow
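
To illustrate the idea behind the two links (they are Java classes; this is a Python sketch and not their code), synonyms can be grouped by reading the s(...) facts in WordNet's Prolog file. The sketch assumes lines of the form s(synset_id,w_num,'word',ss_type,sense_number,tag_count). and that apostrophes inside words are doubled; the function name synonym_groups is hypothetical.

```python
import re
from collections import defaultdict

# Matches one fact such as: s(<synset_id>,<w_num>,'<word>',<ss_type>,<sense>,<tag_count>).
S_FACT = re.compile(r"^s\((\d+),\d+,'(.*)',[a-z],\d+,\d+\)\.$")


def synonym_groups(lines):
    """Group words that share a synset id; single-word synsets are dropped."""
    groups = defaultdict(list)
    for line in lines:
        match = S_FACT.match(line.strip())
        if match:
            synset_id = match.group(1)
            # Prolog escapes an apostrophe inside a quoted atom by doubling it.
            word = match.group(2).replace("''", "'")
            groups[synset_id].append(word)
    return [words for words in groups.values() if len(words) > 1]
```

Each returned group could then be written out as one comma-separated line of a synonyms file for the search engine to load; how that file is wired into Solr is outside this sketch.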

WordNet similarities java

半城伤御伤魂 submitted on 2019-12-11 11:03:49
Question: I have found a cool Perl library, http://www.d.umn.edu/~tpederse/similarity.html, which computes many similarity measures between words. Is there something like this in Java?
Answer 1: There is Java WordNet::Similarity (you have to scroll down a bit for the downloads). According to the WordNet project page: Java WordNet::Similarity [...] is a pure Java version of Perl WordNet::Similarity (developed by Ted Pedersen). The code supports all of the measures found in the Perl version.
Source: https:/