nltk

What is the NLTK FCFG grammar standard/specification?

Submitted by 无人久伴 on 2021-01-29 04:55:58
Question: NLTK (Natural Language Toolkit) lets you parse an FCFG grammar using nltk.FCFG.fromstring([grammar string here]). Where is the FCFG grammar format specification*? I googled it to death, but all I could find was this. (*i.e. the grammar language specification)

Answer 1: From the demo:

    >>> from nltk import CFG
    >>> grammar = CFG.fromstring("""
    ... S -> NP VP
    ... PP -> P NP
    ... NP -> Det N | NP PP
    ... VP -> V NP | VP PP
    ... Det -> 'a' | 'the'
    ... N -> 'dog' | 'cat'
    ... V -> 'chased' | 'sat'
    ... P -> 'on' | 'in'
    ... """)
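
The demo in the answer covers plain CFG syntax; feature grammars (FCFG) use the same arrow notation extended with [FEATURE=value] annotations, and NLTK parses them with the FeatureGrammar class. Chapter 9 of the NLTK book ("Building Feature Based Grammars") is the closest thing to a specification. A minimal sketch, with an illustrative agreement grammar that is not from the original question:

    from nltk.grammar import FeatureGrammar
    from nltk.parse import FeatureChartParser

    # Features live in square brackets; ?n is a variable shared between
    # rules, which enforces number agreement between subject and verb.
    fcfg = FeatureGrammar.fromstring("""
    % start S
    S -> NP[NUM=?n] VP[NUM=?n]
    NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
    VP[NUM=?n] -> V[NUM=?n]
    Det[NUM=sg] -> 'this'
    Det[NUM=pl] -> 'these'
    N[NUM=sg] -> 'dog'
    N[NUM=pl] -> 'dogs'
    V[NUM=sg] -> 'barks'
    V[NUM=pl] -> 'bark'
    """)

    parser = FeatureChartParser(fcfg)
    for tree in parser.parse('these dogs bark'.split()):
        print(tree)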

Get a full list of all hyponyms [duplicate]

Submitted by 放肆的年华 on 2021-01-29 03:06:18
Question: This question already has answers here: How to get all the hyponyms of a word/synset in python nltk and wordnet? (2 answers). Closed 3 years ago.

Is there any way I can get a full list of hyponyms related to a single word?

    relative = wordnet.synsets("relative", pos='n')[0]
    hyponyms = [lemma.name() for synset in relative.hyponyms()
                for lemma in synset.lemmas()]

This gives me quite a lot of hyponyms, but many that are in the full hyponym list on WordNet's online search are not in my list.
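
The snippet above only collects direct hyponyms, one level down, while WordNet's online browser shows the full transitive set. A sketch of the usual fix, recursing with Synset.closure() over the same 'relative' synset as in the question:

    from nltk.corpus import wordnet

    relative = wordnet.synsets("relative", pos='n')[0]

    # closure() applies the relation transitively, so hyponyms of
    # hyponyms (and deeper descendants) are included as well.
    all_hyponyms = set(relative.closure(lambda s: s.hyponyms()))
    names = sorted({lemma.name() for synset in all_hyponyms
                    for lemma in synset.lemmas()})
    print(len(names), names[:10])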

How to find the lemmas and frequency count of each word in a list of sentences?

Submitted by 醉酒当歌 on 2021-01-28 12:43:52
Question: I want to find the lemmas using the WordNet lemmatizer, and I also need to compute each word's frequency. I am getting the following error; the trace is:

    TypeError: unhashable type: 'list'

Note: the corpus is available in the nltk package itself. What I have tried so far:

    import nltk, re
    import string
    from collections import Counter
    from string import punctuation
    from nltk.tokenize import TweetTokenizer, sent_tokenize, word_tokenize
    from nltk.corpus import gutenberg,
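
The code is cut off above, but this TypeError almost always means a nested structure (a list of token lists) was handed to something that needs hashable items, such as a Counter key or a set. A sketch of one working approach under that assumption, flattening to a single token stream before counting (the Gutenberg file name is illustrative):

    from collections import Counter
    from nltk.corpus import gutenberg
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()

    # gutenberg.words() yields a flat sequence of string tokens, so each
    # item is hashable, unlike a list of per-sentence token lists.
    tokens = gutenberg.words('austen-emma.txt')
    lemmas = [lemmatizer.lemmatize(t.lower()) for t in tokens if t.isalpha()]
    freq = Counter(lemmas)
    print(freq.most_common(10))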

How to use a tokenized sentence as input for spaCy's POS tagger?

Submitted by 不想你离开。 on 2021-01-28 11:00:23
Question: spaCy's POS tagger is really convenient; it can tag a raw sentence directly:

    import spacy
    sp = spacy.load('en_core_web_sm')
    sen = sp(u"I am eating")

But I'm using the tokenizer from nltk. So how do I use a tokenized sentence like ['I', 'am', 'eating'] rather than 'I am eating' with spaCy's tagger? BTW, where can I find detailed spaCy documentation? I can only find an overview on the official website. Thanks.

Answer 1: There are two options: you write a wrapper around the nltk tokenizer and use it to
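
The answer is truncated, but the second option it is building up to is usually to construct a spacy.tokens.Doc directly from the pre-tokenized words and run the remaining pipeline components over it. A sketch of that approach (pipeline details can differ between spaCy versions):

    import spacy
    from spacy.tokens import Doc

    nlp = spacy.load('en_core_web_sm')
    tokens = ['I', 'am', 'eating']  # e.g. produced by nltk.word_tokenize

    # Build a Doc straight from the tokens, bypassing spaCy's tokenizer,
    # then apply the rest of the pipeline (tagger, parser, ...) by hand.
    doc = Doc(nlp.vocab, words=tokens)
    for name, component in nlp.pipeline:
        doc = component(doc)

    print([(token.text, token.pos_) for token in doc])

As for documentation, the reference material beyond the overview lives in the API section of the official site, https://spacy.io/api.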

How to download all nltk data in google cloud app engine?

Submitted by 假如想象 on 2021-01-28 08:31:20
Question: I have a Django application which I deployed by following https://cloud.google.com/python/django/flexible-environment. But since I am using nltk for text processing, I am getting the error below:

    Resource 'taggers/maxent_treebank_pos_tagger/PY3/english.pickle' not found.
    Please use the NLTK Downloader to obtain the resource: >>> nltk.download()
    Searched in:
      - '/root/nltk_data'
      - '/usr/share/nltk_data'
      - '/usr/local/share
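
One common pattern, sketched below rather than taken from the original thread, is to download the required resources into a directory shipped with the app (at build or startup time) and add that directory to NLTK's search path; the directory path here is an assumption:

    import nltk

    NLTK_DATA_DIR = '/app/nltk_data'  # assumption: a writable directory in the deployed app

    # Fetch only what the app needs; nltk.download('all', ...) also works
    # but is very large for a deployment image.
    for resource in ('punkt', 'maxent_treebank_pos_tagger'):
        nltk.download(resource, download_dir=NLTK_DATA_DIR)

    # Make sure NLTK searches the bundled directory at runtime.
    nltk.data.path.append(NLTK_DATA_DIR)

The same download can be done as a build step with the CLI, e.g. python -m nltk.downloader -d /app/nltk_data punkt maxent_treebank_pos_tagger.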

Must capture output of a function that has no return statement

Submitted by 孤者浪人 on 2021-01-28 08:13:36
Question: I'm using the NLTK package, and it has a function that tells me whether a given sentence is positive, negative, or neutral:

    from nltk.sentiment.util import demo_liu_hu_lexicon
    demo_liu_hu_lexicon('Today is a an awesome, happy day')
    >>> Positive

The problem is that this function doesn't have a return statement; it just prints "Positive", "Negative", or "Neutral" to stdout. All it returns, implicitly, is a NoneType object. (Here's the function's source code.) Is there any way I can capture this output?
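
A sketch of the standard way to capture it: temporarily redirect stdout into an in-memory buffer while the function runs, then read the buffer back:

    import io
    from contextlib import redirect_stdout
    from nltk.sentiment.util import demo_liu_hu_lexicon

    buffer = io.StringIO()
    with redirect_stdout(buffer):
        demo_liu_hu_lexicon('Today is a an awesome, happy day')

    # Whatever the function printed now sits in the buffer, not the console.
    label = buffer.getvalue().strip()
    print(label)  # 'Positive', 'Negative', or 'Neutral'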

How to check if wordnet is already installed?

Submitted by 倖福魔咒の on 2021-01-28 07:54:22
Question: I know that we can check for resources like this:

    try:
        nltk.data.find('tokenizers/punkt')
    except LookupError:
        nltk.download('punkt')

But I can't find the way to do this for wordnet:

    try:
        nltk.data.find('wordnet')  # ????/wordnet
    except LookupError:
        nltk.download('wordnet')

How can I make this check?

Answer 1: You can do:

    nltk.data.find('corpora/wordnet')

Source: https://stackoverflow.com/questions/57925041/how-to-check-if-wordnet-is-already-installed
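
Putting the answer together with the try/except pattern from the question, a complete check looks like this (the key detail is that WordNet is filed under the 'corpora' category):

    import nltk

    try:
        nltk.data.find('corpora/wordnet')
    except LookupError:
        nltk.download('wordnet')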

Is there a way to reverse stem in python nltk?

Submitted by 你说的曾经没有我的故事 on 2021-01-28 07:52:08
Question: I have a list of stems in NLTK/Python and want to get the possible words that each stem comes from. Is there a way to take a stem and get a list of words that stem to it in Python?

Answer 1: To the best of my knowledge the answer is no. Depending on the stemmer, it can be difficult to build an exhaustive search that reverses the stemming rules, and the results would mostly be invalid words by any standard. For example, with the Porter stemmer:

    from nltk.stem.porter import *
    stemmer = PorterStemmer()
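
The answer is cut off above, but since stemming rules cannot be reliably un-applied, a common workaround (a sketch, not from the truncated answer) is to invert the mapping over a vocabulary you already have: stem every word once and index words by their stem. The 'words' corpus here is just an illustrative vocabulary:

    from collections import defaultdict
    from nltk.corpus import words  # plain English word list; needs nltk.download('words')
    from nltk.stem.porter import PorterStemmer

    stemmer = PorterStemmer()

    # Build stem -> {words that produce it} once; lookups are then O(1).
    stem_index = defaultdict(set)
    for w in words.words():
        stem_index[stemmer.stem(w)].add(w)

    print(stem_index[stemmer.stem('running')])  # words sharing the stem of 'running'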