pos-tagger | 易学教程

What is NLTK POS tagger asking me to download?

I just started using a part-of-speech tagger, and I am facing many problems. I started POS tagging with the following:
    import nltk
    text = nltk.word_tokenize("We are going out.Just you and me.")
When I want to print 'text', the following happens:
    print nltk.pos_tag(text)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "F:\Python26\lib\site-packages\nltk\tag\__init__.py", line 63, in pos_tag
        tagger = nltk.data.load(_POS_TAGGER)
      File "F:\Python26\lib\site-packages\nltk\data.py", line 594, in load
        resource_val = pickle.load(_open(resource_url))
      File "F:\Python26\lib\site

NLTK v3.2: Unable to nltk.pos_tag()

Hi text mining champions, I'm using Anaconda with NLTK v3.2 on Windows 10 (client's environment). When I try to POS tag, I keep getting a urllib2 error:
    URLError: <urlopen error unknown url type: c>
It seems urllib2 is unable to recognize Windows paths. How can I work around this? The command is as simple as:
    nltk.pos_tag(nltk.word_tokenize("Hello World"))
Edit: There is a duplicate question; however, I think the answers given here by manan and alvas are a better fix. MananVyas EDITED This issue has been resolved as of NLTK v3.2.1. Upgrading your NLTK version would resolve the issue, e.g. pip
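The "unknown url type: c" message is consistent with a generic URL parser reading the drive letter of a Windows path as a URL scheme. The following is a minimal sketch using only the Python 3 standard library; the path is hypothetical and this is not NLTK's own loading code, just an illustration of the parsing behaviour:

```python
from urllib.parse import urlparse

# Hypothetical Windows path, chosen only for illustration.
windows_path = r"C:\nltk_data\taggers"

# A generic URL parser treats the text before the first ":" as the scheme,
# so the drive letter "C" is returned as the (lowercased) scheme "c".
parsed = urlparse(windows_path)
print(parsed.scheme)

# Writing the path as a file: URL gives the parser a scheme it recognizes.
# This simple conversion assumes the path has no characters that need escaping.
file_uri = "file:///" + windows_path.replace("\\", "/")
print(file_uri)
print(urlparse(file_uri).scheme)
```

Under this reading, the drive letter rather than the tagger itself is what trips the loader, which would explain why upgrading NLTK is suggested as the fix.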

TreeTagger installation successful but cannot open .par file

Does anyone know how to resolve this file reading error in TreeTagger, a common Natural Language Processing tool used to POS tag, lemmatize and chunk sentences?
    alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
    reading parameters ...
    ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
    aborted.
I didn't encounter any of the possible installation problems hinted at on http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt . I've followed the instructions on the webpage and it's installed properly ( http://www.ims.uni-stuttgart.de

POS-Tagger is incredibly slow

Question: I am using nltk to generate n-grams from sentences by first removing given stop words. However, nltk.pos_tag() is extremely slow, taking up to 0.6 sec on my CPU (Intel i7). The output:
    ['The first time I went, and was completely taken by the live jazz band and atmosphere, I ordered the Lobster Cobb Salad.'] 0.620481014252
    ["It's simply the best meal in NYC."] 0.640982151031
    ['You cannot go wrong at the Red Eye Grill.'] 0.644664049149
The code:
    for sentence in source:
        nltk_ngrams = None
        if stop

Extracting noun+noun or (adj|noun)+noun from Text

Question: I would like to ask whether it is possible to extract noun+noun or (adj|noun)+noun with the R package openNLP. That is, I would like to use linguistic filtering to extract candidate noun phrases. Could you show me how to do this? Many thanks. Thanks for the responses. Here is the code:
    library("openNLP")
    acq <- "Gulf Applied Technologies Inc said it sold its subsidiaries engaged in pipeline and terminal operations for 12.2 mln dlrs. The company said the sale is subject to certain post closing adjustments,
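The linguistic filter itself can be sketched independently of the tagger. The following is a minimal Python illustration (not R, and not the openNLP API). It assumes the tokens are already tagged with Penn Treebank tags; the tags below were assigned by hand for one sentence of the sample text, and the function name is hypothetical:

```python
def candidate_bigrams(tagged_tokens):
    """Return adjacent word pairs matching (adjective|noun) + noun."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        # Penn Treebank adjective tags start with "JJ", noun tags with "NN".
        first_ok = t1.startswith("JJ") or t1.startswith("NN")
        if first_ok and t2.startswith("NN"):
            pairs.append((w1, w2))
    return pairs


# Hand-tagged tokens (an assumption, not real tagger output).
tagged = [
    ("The", "DT"), ("company", "NN"), ("said", "VBD"), ("the", "DT"),
    ("sale", "NN"), ("is", "VBZ"), ("subject", "JJ"), ("to", "TO"),
    ("certain", "JJ"), ("post", "NN"), ("closing", "NN"),
    ("adjustments", "NNS"),
]

for pair in candidate_bigrams(tagged):
    print(pair)
```

With these hand-assigned tags, the filter keeps "certain post", "post closing" and "closing adjustments" and skips pairs such as "the sale", whose first word is a determiner.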

How to apply pos_tag_sents() to pandas dataframe efficiently

Question: In situations where you wish to POS tag a column of text stored in a pandas dataframe with one sentence per row, the majority of implementations on SO use the apply method:
    dfData['POSTags'] = dfData['SourceText'].apply(lambda row: [pos_tag(word_tokenize(row)) for item in row])
The NLTK documentation recommends using pos_tag_sents() for efficient tagging of more than one sentence. Does that apply to this example and, if so, would the code be as simple as changing pos_tag to pos_tag_sents or does

Python NLTK pos_tag not returning the correct part-of-speech tag

Question: Having this:
    text = word_tokenize("The quick brown fox jumps over the lazy dog")
And running:
    nltk.pos_tag(text)
I get:
    [('The', 'DT'), ('quick', 'NN'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'NN'), ('dog', 'NN')]
This is incorrect. The tags for quick, brown and lazy in the sentence should be:
    ('quick', 'JJ'), ('brown', 'JJ'), ('lazy', 'JJ')
Testing this through their online tool gives the same