wordnet | 易学教程

Stemmers vs Lemmatizers

Question: Natural Language Processing (NLP), especially for English, has reached the stage where stemming would become an archaic technology if "perfect" lemmatizers existed, because stemmers change the surface form of a word/token into meaningless stems. Then again, the definition of a "perfect" lemmatizer is questionable, because different NLP tasks require different levels of lemmatization, e.g. converting words between verb/noun/adjective forms. Stemmers [in]: having [out]: hav
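The contrast the question draws can be sketched with two toy functions. This is only an illustration under stated assumptions: `toy_stem`, `toy_lemmatize`, the suffix list and the lemma table are all hypothetical, and neither function is the Porter algorithm or WordNet's lemmatizer. The stemmer cuts suffixes mechanically and can yield a non-word such as "hav", while the lemmatizer maps a token to a dictionary form.

```python
# Toy illustration only: not the Porter algorithm and not WordNet's lemmatizer.

SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical suffix list, checked in order

# Hypothetical hand-written lemma table standing in for a real lexicon.
LEMMAS = {"having": "have", "had": "have", "has": "have"}


def toy_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(token: str) -> str:
    """Look the token up in the lemma table; fall back to the lowercased token."""
    word = token.lower()
    return LEMMAS.get(word, word)


if __name__ == "__main__":
    for token in ("having", "had", "cats"):
        print(token, toy_stem(token), toy_lemmatize(token))
```

The stemmer needs no lexicon but cannot relate "had" to "have"; the lemmatizer can, but only for entries its table contains.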

How to do case conversion in Prolog?

Question: I'm interfacing with WordNet, and some of the terms I'd like to classify (various proper names) are capitalised in the database, but the input I get may not be capitalised properly. My initial idea is to write a predicate that produces the possible capitalisations of an input, but I'm not sure how to go about it. Does anyone have an idea how to do this or, even better, a more efficient way to achieve what I want? Answer 1: It depends on what Prolog implementation you
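The two ideas in the question can be sketched as follows, in Python rather than Prolog and purely to illustrate the logic. Both helper names are hypothetical. `case_variants` enumerates candidate capitalisations of an input, and `build_case_index` is one possible "more efficient way": fold the case of the database entries once, so a single lookup replaces trying every variant.

```python
from typing import Dict, Iterable, List


def case_variants(term: str) -> List[str]:
    """Return distinct capitalisations of term, in a fixed order."""
    candidates = [
        term,
        term.lower(),
        term.upper(),
        term.capitalize(),  # first letter only: "New york"
        term.title(),       # every word: "New York"
    ]
    variants: List[str] = []
    for candidate in candidates:
        if candidate not in variants:  # drop duplicates, keep order
            variants.append(candidate)
    return variants


def build_case_index(entries: Iterable[str]) -> Dict[str, List[str]]:
    """Map each case-folded entry to the original spellings that produce it."""
    index: Dict[str, List[str]] = {}
    for entry in entries:
        index.setdefault(entry.casefold(), []).append(entry)
    return index


if __name__ == "__main__":
    print(case_variants("new york"))
    index = build_case_index(["London", "london", "Paris"])
    print(index.get("LONDON".casefold(), []))
```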

How to compare output of wordnet.synsets?

Question: I want to compare the output of the wordnet.synsets function in the NLTK library. For example, when I run: from nltk.corpus import wordnet as wn; wn.synsets('dog') I get the output: [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')] Now if I try: from nltk.corpus import wordnet as wn; wn.synsets('dogg') I get the output: [] How can I compare the outputs in the console to
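Since the lookup returns an empty list for an unknown word, one way to tell the two cases apart is to test the result's truthiness. The sketch below assumes that is the comparison the question is after. `split_by_lookup` is a hypothetical helper; it takes the lookup function as a parameter, so `wn.synsets` can be passed in when NLTK and its WordNet data are installed.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def split_by_lookup(
    words: Iterable[str],
    lookup: Callable[[str], Sequence],
) -> Tuple[List[str], List[str]]:
    """Separate words with at least one result from words with none."""
    found: List[str] = []
    missing: List[str] = []
    for word in words:
        if lookup(word):  # an empty list is falsy
            found.append(word)
        else:
            missing.append(word)
    return found, missing


# With NLTK available, the call might look like this:
#   from nltk.corpus import wordnet as wn
#   found, missing = split_by_lookup(["dog", "dogg"], wn.synsets)
```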

Synonyms from wordnet package are very incomplete

Question: The synonyms function of the wordnet package misses obvious synonyms that the WordNet app finds. How can I get that other data out of WordNet using R's wordnet package? I am using R 3.4.4. The app shows 9 senses of company: Sense 1 company -- (an institution created to conduct business; "he only invests in large well-established companies"; "he started the company in his garage") => institution, establishment -- (an organization founded and united for a specific purpose) Sense 2 company -- (small military

Extract “emotion words” / affect words from English corpus?

Question: I have lots of English-language text and am looking for a way to extract the words that have emotional content, such as "anger," "hate," "paranoid," "excited," and so on. Is there a way to do this with NLTK or WordNet? Answer 1: You can use the SentiWordNet interface in NLTK to check the emotional content of an English word. Usage from NLTK: >>> from nltk.corpus import sentiwordnet as swn >>> list(swn.senti_synsets('breakdown')) [SentiSynset('dislocation.n.02'), SentiSynset('breakdown.n.02'),
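Building on the SentiWordNet answer, a filter for affect words might look like the sketch below. The function names and the 0.5 threshold are assumptions, not part of NLTK. The code relies only on each sense object exposing `pos_score()` and `neg_score()`, as NLTK's `SentiSynset` does, and it takes the lookup as a parameter so that `swn.senti_synsets` can be supplied when the corpus is installed.

```python
from typing import Callable, Iterable, List


def affect_score(word: str, senti_lookup: Callable[[str], Iterable]) -> float:
    """Highest positive or negative score over all senses; 0.0 if none."""
    scores = [
        max(sense.pos_score(), sense.neg_score())
        for sense in senti_lookup(word)
    ]
    return max(scores, default=0.0)


def extract_affect_words(
    tokens: Iterable[str],
    senti_lookup: Callable[[str], Iterable],
    threshold: float = 0.5,  # hypothetical cut-off, tune for the corpus
) -> List[str]:
    """Keep tokens whose strongest sense reaches the threshold."""
    return [t for t in tokens if affect_score(t, senti_lookup) >= threshold]


# With NLTK and the SentiWordNet data available, the call might look like:
#   from nltk.corpus import sentiwordnet as swn
#   extract_affect_words(["anger", "table"], swn.senti_synsets)
```

Taking the maximum over senses is a deliberately permissive choice: a word counts as affective if any of its senses is strongly polar. Averaging over senses would be a stricter alternative.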

WordNet: file was built for i386 which is not the architecture being linked (x86_64)

Question: I get the following error when running make to compile WordNet 3.0: gcc -m64 -g -O2 -o wishwn wishwn-tkAppInit.o wishwn-stubs.o -L../lib -lWN - F/Library/Frameworks -framework Tk -F/Library/Frameworks -framework Tcl -lpthread -framework CoreFoundation -framework Cocoa -framework Carbon -framework IOKit -lz -lpthread -framework CoreFoundation ld: warning: ignoring file /Library/Frameworks/Tk.framework/Tk, file was built for i386 which is not the architecture being linked (x86_64):

How to get word hierarchy (e.g., hypernyms, hyponyms) using wordnet in R

Question: I want to use the wordnet package in R to get word hierarchies, for example: "animal" is a hypernym of "cat", and "apple" is a hyponym of "fruit". But the only code I can find in the R wordnet help file identifies antonyms: install.packages("wordnet", dependencies=TRUE); library(wordnet); filter <- getTermFilter("ExactMatchFilter", "cold", TRUE); terms <- getIndexTerms("ADJECTIVE", 5, filter); synsets <- getSynsets(terms[[1]]); related <- getRelatedSynsets(synsets[[1]],"!"); sapply(related,

Can WordNetLemmatizer in NLTK stem words?

Question: I want to find word stems with WordNet. Does WordNet have a function for stemming? I use this import for my stemming, but it doesn't work as expected: from nltk.stem.wordnet import WordNetLemmatizer; WordNetLemmatizer().lemmatize('Having','v') Answer 1: Try using one of the stemmers in the nltk.stem module, such as the PorterStemmer. Here's an online demo of NLTK's stemmers: http://text-processing.com/demo/stem/ Answer 2: It seems you have to pass a lowercase string to the lemmatize method: >>>
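The lowercase observation in the second answer can be wrapped in a small helper, sketched here under the assumption that case is the only issue. `lemmatize_normalized` is a hypothetical name; it accepts any object with a `lemmatize(word, pos)` method, such as an NLTK `WordNetLemmatizer` instance.

```python
def lemmatize_normalized(token: str, pos: str, lemmatizer) -> str:
    """Lowercase the token before handing it to the lemmatizer."""
    return lemmatizer.lemmatize(token.lower(), pos)


# With NLTK and the WordNet data available, the call might look like:
#   from nltk.stem.wordnet import WordNetLemmatizer
#   lemmatize_normalized("Having", "v", WordNetLemmatizer())
```

Lowercasing unconditionally loses the distinction between proper names and common words, so for mixed input it may be safer to try the original form first and fall back to the lowercased one.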

NLTK data out of date - Python 3.4

Question: I'm trying to install NLTK for Python 3.4. The NLTK module itself appears to have installed fine. I then ran import nltk; nltk.download() and chose to download everything. However, after it finished, the window simply says 'out of date'. I tried refreshing and downloading again, yet it stays 'out of date'. I looked online and tried various fixes, but I haven't found any that helped in my case. I also tried to manually find the missing parts, which turned out to be 'Open

Could not find WordNet dictionary error

Question: I'm having trouble running wordnet in R. I loaded it with library() initially, but it didn't work. The error looked like this: Warning message: In initDict() : cannot find WordNet 'dict' directory: please set the environment variable WNHOME to its parent So I added this line: Sys.setenv(WNHOME = "C:\\Program Files (x86)\\WordNet\\2.1") and was then able to use the library function to load it. I don't understand this line or the error message at all, but it seems to fix the problem. However,