pos-tagger

What is NLTK POS tagger asking me to download?

丶灬走出姿态 posted on 2019-11-27 20:27:00
I just started using a part-of-speech tagger, and I am facing many problems. I started POS tagging with the following:

import nltk
text = nltk.word_tokenize("We are going out.Just you and me.")

When I want to print 'text', the following happens:

print nltk.pos_tag(text)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "F:\Python26\lib\site-packages\nltk\tag\__init__.py", line 63, in pos_tag
    tagger = nltk.data.load(_POS_TAGGER)
  File "F:\Python26\lib\site-packages\nltk\data.py", line 594, in load
    resource_val = pickle.load(_open(resource_url))
  File "F:\Python26\lib\site

NLTK v3.2: Unable to nltk.pos_tag()

好久不见. posted on 2019-11-27 09:46:07
Hi text mining champions, I'm using Anaconda with NLTK v3.2 on Windows 10 (client's environment). When I try to POS tag, I keep getting a urllib2 error:

URLError: <urlopen error unknown url type: c>

It seems urllib2 is unable to recognize Windows paths. How can I work around this? The command is as simple as:

nltk.pos_tag(nltk.word_tokenize("Hello World"))

Edit: There is a duplicate question; however, I think the answers obtained here by manan and alvas are a better fix.

MananVyas EDITED This issue has been resolved as of NLTK v3.2.1. Upgrading your NLTK version would resolve the issue, e.g. pip

TreeTagger installation successful but cannot open .par file

眉间皱痕 posted on 2019-11-27 09:45:43
Does anyone know how to resolve this file reading error in TreeTagger, a common Natural Language Processing tool used to POS tag, lemmatize and chunk sentences?

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.

I didn't encounter any of the possible installation problems hinted at on http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt . I've followed the instructions on the webpage and it's installed properly ( http://www.ims.uni-stuttgart.de

POS-Tagger is incredibly slow

こ雲淡風輕ζ posted on 2019-11-27 06:07:43
Question: I am using nltk to generate n-grams from sentences after first removing given stop words. However, nltk.pos_tag() is extremely slow, taking up to 0.6 sec on my CPU (Intel i7).

The output:

['The first time I went, and was completely taken by the live jazz band and atmosphere, I ordered the Lobster Cobb Salad.'] 0.620481014252
["It's simply the best meal in NYC."] 0.640982151031
['You cannot go wrong at the Red Eye Grill.'] 0.644664049149

The code:

for sentence in source:
    nltk_ngrams = None
    if stop
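A minimal sketch of one way to avoid paying a model-loading cost on every call, assuming the slowdown comes from reloading the tagger each time (the traceback in another question on this page shows `tagger = nltk.data.load(_POS_TAGGER)` running inside `pos_tag` in older NLTK releases). `load_model`, `get_tagger` and `tag` are hypothetical names, and the "model" is a stand-in that does not use NLTK:

```python
from functools import lru_cache


def load_model():
    """Hypothetical stand-in for an expensive load, e.g. unpickling a tagger."""
    load_model.calls += 1
    # Trivial placeholder "model": tags every token as a noun.
    return lambda tokens: [(tok, "NN") for tok in tokens]


load_model.calls = 0  # counts how many times the model is actually loaded


@lru_cache(maxsize=1)
def get_tagger():
    """Load the tagger on first use and reuse the same object afterwards."""
    return load_model()


def tag(tokens):
    return get_tagger()(tokens)


sentences = [["It", "is", "simple"], ["You", "cannot", "go", "wrong"]]
tagged = [tag(sentence) for sentence in sentences]
```

However many sentences are tagged, `load_model` runs once. Whether this helps in a given setup depends on the NLTK version in use, so timing before and after is advisable.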

Extracting noun+noun or (adj|noun)+noun from Text

北慕城南 posted on 2019-11-27 01:41:48
Question: I would like to ask whether it is possible to extract noun+noun or (adj|noun)+noun in the R package openNLP. That is, I would like to use linguistic filtering to extract candidate noun phrases. Could you show me how to do this? Many thanks. Thanks for the responses. Here is the code:

library("openNLP")
acq <- "Gulf Applied Technologies Inc said it sold its subsidiaries engaged in pipeline and terminal operations for 12.2 mln dlrs. The company said the sale is subject to certain post closing adjustments,
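The (adj|noun)+noun filter itself is independent of the tagger, so it can be sketched over already-tagged tokens. The following is a Python illustration, not openNLP code: it assumes Penn Treebank style tags (adjectives start with `JJ`, nouns with `NN`), the function name `candidate_phrases` is hypothetical, and the tags in the example are hand-assigned rather than tagger output.

```python
def candidate_phrases(tagged):
    """Return maximal adjective/noun runs that end in a noun and span >= 2 tokens."""
    phrases, run = [], []

    def flush():
        # Drop trailing adjectives so the phrase ends on a noun.
        while run and not run[-1][1].startswith("NN"):
            run.pop()
        if len(run) >= 2:
            phrases.append(" ".join(word for word, _ in run))
        run.clear()

    for word, tag in tagged:
        if tag.startswith("NN") or tag.startswith("JJ"):
            run.append((word, tag))
        else:
            flush()
    flush()  # handle a run that reaches the end of the sentence
    return phrases


# Hand-tagged example tokens (illustrative only).
tagged = [("The", "DT"), ("company", "NN"), ("sold", "VBD"), ("its", "PRP$"),
          ("pipeline", "NN"), ("operations", "NNS"), ("subject", "JJ"),
          ("to", "TO"), ("certain", "JJ"), ("closing", "NN"), ("adjustments", "NNS")]
print(candidate_phrases(tagged))
# → ['pipeline operations', 'certain closing adjustments']
```

The same loop carries over to any tagger that yields (word, tag) pairs; only the tag prefixes would need adjusting for a different tagset.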

How to apply pos_tag_sents() to pandas dataframe efficiently

為{幸葍}努か posted on 2019-11-26 23:21:55
Question: In situations where you wish to POS tag a column of text stored in a pandas dataframe with one sentence per row, the majority of implementations on SO use the apply method:

dfData['POSTags'] = dfData['SourceText'].apply(lambda row: [pos_tag(word_tokenize(row)) for item in row])

The NLTK documentation recommends using pos_tag_sents() for efficient tagging of more than one sentence. Does that apply to this example, and if so, would the code be as simple as changing pos_tag to pos_tag_sents or does
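A rough sketch of the batching idea raised in the question: tokenize every row first, then hand all rows to the sentence-level tagger in a single call instead of calling the tagger once per row. The helper `tag_column` and the `fake_*` stand-ins are hypothetical and exist only so the sketch runs without NLTK or pandas installed.

```python
def tag_column(texts, tokenize, tag_sents):
    """Tokenize each row, then tag all rows with one batched call."""
    tokenized = [tokenize(text) for text in texts]
    return tag_sents(tokenized)


# Stand-ins for illustration only; they do not perform real tagging.
def fake_tokenize(text):
    return text.split()


def fake_tag_sents(sentences):
    return [[(tok, "NN") for tok in sent] for sent in sentences]


rows = ["Hello World", "We are going out"]
tags = tag_column(rows, fake_tokenize, fake_tag_sents)
```

With NLTK and pandas available, the call would presumably look like `dfData['POSTags'] = tag_column(dfData['SourceText'].tolist(), word_tokenize, pos_tag_sents)`, which yields one list of (token, tag) pairs per row. Any speed-up over `apply` is worth measuring on your own data.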

What is NLTK POS tagger asking me to download?

瘦欲@ posted on 2019-11-26 20:21:06
Question: I just started using a part-of-speech tagger, and I am facing many problems. I started POS tagging with the following:

import nltk
text = nltk.word_tokenize("We are going out.Just you and me.")

When I want to print 'text', the following happens:

print nltk.pos_tag(text)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "F:\Python26\lib\site-packages\nltk\tag\__init__.py", line 63, in pos_tag
    tagger = nltk.data.load(_POS_TAGGER)
  File "F:\Python26\lib\site-packages\nltk

NLTK v3.2: Unable to nltk.pos_tag()

故事扮演 posted on 2019-11-26 14:54:13
Question: Hi text mining champions, I'm using Anaconda with NLTK v3.2 on Windows 10 (client's environment). When I try to POS tag, I keep getting a urllib2 error:

URLError: <urlopen error unknown url type: c>

It seems urllib2 is unable to recognize Windows paths. How can I work around this? The command is as simple as:

nltk.pos_tag(nltk.word_tokenize("Hello World"))

Edit: There is a duplicate question; however, I think the answers obtained here by manan and alvas are a better fix.

Answer 1: EDITED This issue has

TreeTagger installation successful but cannot open .par file

梦想的初衷 posted on 2019-11-26 14:49:52
Question: Does anyone know how to resolve this file reading error in TreeTagger, a common Natural Language Processing tool used to POS tag, lemmatize and chunk sentences?

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.

I didn't encounter any of the possible installation problems hinted at on http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I've

Python NLTK pos_tag not returning the correct part-of-speech tag

ぃ、小莉子 posted on 2019-11-26 01:36:04
Question: Having this:

text = word_tokenize("The quick brown fox jumps over the lazy dog")

And running:

nltk.pos_tag(text)

I get:

[('The', 'DT'), ('quick', 'NN'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'NN'), ('dog', 'NN')]

This is incorrect. The tags for quick, brown and lazy in the sentence should be:

('quick', 'JJ'), ('brown', 'JJ'), ('lazy', 'JJ')

Testing this through their online tool gives the same