nltk

What is the NLTK FCFG grammar standard/specification?

Submitted by 无人久伴 on 2021-01-29 04:55:58
Question: NLTK (Natural Language Toolkit) lets you parse an FCFG grammar using nltk.FCFG.fromstring([grammar string here]). Where is the FCFG grammar format specification*? I googled it to death, but all I could find was this. (*i.e. the grammar language specification)

Answer 1: From the demo:

    >>> from nltk import CFG
    >>> grammar = CFG.fromstring("""
    ... S -> NP VP
    ... PP -> P NP
    ... NP -> Det N | NP PP
    ... VP -> V NP | VP PP
    ... Det -> 'a' | 'the'
    ... N -> 'dog' | 'cat'
    ... V -> 'chased' | 'sat'
    ... P -> 'on' | 'in'
    ... """)
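
The demo in the answer covers plain CFG syntax; feature grammars (FCFG) use the same arrow notation extended with [FEATURE=value] annotations, and NLTK parses them with the FeatureGrammar class. Chapter 9 of the NLTK book ("Building Feature Based Grammars") is the closest thing to a specification. A minimal sketch, with an illustrative agreement grammar that is not from the original question:

    from nltk.grammar import FeatureGrammar
    from nltk.parse import FeatureChartParser

    # Features live in square brackets; ?n is a variable shared between
    # rules, which enforces number agreement between subject and verb.
    fcfg = FeatureGrammar.fromstring("""
    % start S
    S -> NP[NUM=?n] VP[NUM=?n]
    NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
    VP[NUM=?n] -> V[NUM=?n]
    Det[NUM=sg] -> 'this'
    Det[NUM=pl] -> 'these'
    N[NUM=sg] -> 'dog'
    N[NUM=pl] -> 'dogs'
    V[NUM=sg] -> 'barks'
    V[NUM=pl] -> 'bark'
    """)

    parser = FeatureChartParser(fcfg)
    for tree in parser.parse('these dogs bark'.split()):
        print(tree)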

Get a full list of all hyponyms [duplicate]

Submitted by 放肆的年华 on 2021-01-29 03:06:18
Question: This question already has answers here: How to get all the hyponyms of a word/synset in python nltk and wordnet? (2 answers). Closed 3 years ago.

Is there any way I can get a full list of hyponyms related to a single word?

    relative = wordnet.synsets("relative", pos='n')[0]
    hyponyms = [lemma.name() for synset in relative.hyponyms()
                for lemma in synset.lemmas()]

This gives me quite a lot of hyponyms, but many that are in the full hyponym list on WordNet's online search are not in my list.
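
The snippet above only collects direct hyponyms, one level down, while WordNet's online browser shows the full transitive set. A sketch of the usual fix, recursing with Synset.closure() over the same 'relative' synset as in the question:

    from nltk.corpus import wordnet

    relative = wordnet.synsets("relative", pos='n')[0]

    # closure() applies the relation transitively, so hyponyms of
    # hyponyms (and deeper descendants) are included as well.
    all_hyponyms = set(relative.closure(lambda s: s.hyponyms()))
    names = sorted({lemma.name() for synset in all_hyponyms
                    for lemma in synset.lemmas()})
    print(len(names), names[:10])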

How to find the lemmas and frequency count of each word in a list of sentences?

Submitted by 醉酒当歌 on 2021-01-28 12:43:52
Question: I want to find the lemmas using the WordNet lemmatizer, and I also need to compute each word's frequency. I am getting the following error; the trace is:

    TypeError: unhashable type: 'list'

Note: the corpus is available in the nltk package itself. What I have tried so far:

    import nltk, re
    import string
    from collections import Counter
    from string import punctuation
    from nltk.tokenize import TweetTokenizer, sent_tokenize, word_tokenize
    from nltk.corpus import gutenberg,
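
The code is cut off above, but this TypeError almost always means a nested structure (a list of token lists) was handed to something that needs hashable items, such as a Counter key or a set. A sketch of one working approach under that assumption, flattening to a single token stream before counting (the Gutenberg file name is illustrative):

    from collections import Counter
    from nltk.corpus import gutenberg
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()

    # gutenberg.words() yields a flat sequence of string tokens, so each
    # item is hashable, unlike a list of per-sentence token lists.
    tokens = gutenberg.words('austen-emma.txt')
    lemmas = [lemmatizer.lemmatize(t.lower()) for t in tokens if t.isalpha()]
    freq = Counter(lemmas)
    print(freq.most_common(10))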

How to use a tokenized sentence as input for spaCy's POS tagger?

Submitted by 不想你离开。 on 2021-01-28 11:00:23
Question: spaCy's POS tagger is really convenient; it can tag a raw sentence directly:

    import spacy
    sp = spacy.load('en_core_web_sm')
    sen = sp(u"I am eating")

But I'm using the tokenizer from nltk. So how do I use a tokenized sentence like ['I', 'am', 'eating'] rather than 'I am eating' with spaCy's tagger? BTW, where can I find detailed spaCy documentation? I can only find an overview on the official website. Thanks.

Answer 1: There are two options: you write a wrapper around the nltk tokenizer and use it to
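
The answer is truncated, but the second option it is building up to is usually to construct a spacy.tokens.Doc directly from the pre-tokenized words and run the remaining pipeline components over it. A sketch of that approach (pipeline details can differ between spaCy versions):

    import spacy
    from spacy.tokens import Doc

    nlp = spacy.load('en_core_web_sm')
    tokens = ['I', 'am', 'eating']  # e.g. produced by nltk.word_tokenize

    # Build a Doc straight from the tokens, bypassing spaCy's tokenizer,
    # then apply the rest of the pipeline (tagger, parser, ...) by hand.
    doc = Doc(nlp.vocab, words=tokens)
    for name, component in nlp.pipeline:
        doc = component(doc)

    print([(token.text, token.pos_) for token in doc])

As for documentation, the reference material beyond the overview lives in the API section of the official site, https://spacy.io/api.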

How to download all nltk data in google cloud app engine?

Submitted by 假如想象 on 2021-01-28 08:31:20
Question: I have a Django application which I deployed by following https://cloud.google.com/python/django/flexible-environment. But since I am using nltk for text processing, I am getting the error below:

    Resource 'taggers/maxent_treebank_pos_tagger/PY3/english.pickle' not found.
    Please use the NLTK Downloader to obtain the resource: >>> nltk.download()
    Searched in:
      - '/root/nltk_data'
      - '/usr/share/nltk_data'
      - '/usr/local/share
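
One common pattern, sketched below rather than taken from the original thread, is to download the required resources into a directory shipped with the app (at build or startup time) and add that directory to NLTK's search path; the directory path here is an assumption:

    import nltk

    NLTK_DATA_DIR = '/app/nltk_data'  # assumption: a writable directory in the deployed app

    # Fetch only what the app needs; nltk.download('all', ...) also works
    # but is very large for a deployment image.
    for resource in ('punkt', 'maxent_treebank_pos_tagger'):
        nltk.download(resource, download_dir=NLTK_DATA_DIR)

    # Make sure NLTK searches the bundled directory at runtime.
    nltk.data.path.append(NLTK_DATA_DIR)

The same download can be done as a build step with the CLI, e.g. python -m nltk.downloader -d /app/nltk_data punkt maxent_treebank_pos_tagger.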

Must capture output of a function that has no return statement

Submitted by 孤者浪人 on 2021-01-28 08:13:36
Question: I'm using the NLTK package, and it has a function that tells me whether a given sentence is positive, negative, or neutral:

    from nltk.sentiment.util import demo_liu_hu_lexicon
    demo_liu_hu_lexicon('Today is a an awesome, happy day')
    >>> Positive

The problem is that this function doesn't have a return statement; it just prints "Positive", "Negative", or "Neutral" to stdout. All it returns, implicitly, is a NoneType object. (Here's the function's source code.) Is there any way I can capture this output?
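
A sketch of the standard way to capture it: temporarily redirect stdout into an in-memory buffer while the function runs, then read the buffer back:

    import io
    from contextlib import redirect_stdout
    from nltk.sentiment.util import demo_liu_hu_lexicon

    buffer = io.StringIO()
    with redirect_stdout(buffer):
        demo_liu_hu_lexicon('Today is a an awesome, happy day')

    # Whatever the function printed now sits in the buffer, not the console.
    label = buffer.getvalue().strip()
    print(label)  # 'Positive', 'Negative', or 'Neutral'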

How to check if wordnet is already installed?

Submitted by 倖福魔咒の on 2021-01-28 07:54:22
Question: I know that we can check for resources like this:

    try:
        nltk.data.find('tokenizers/punkt')
    except LookupError:
        nltk.download('punkt')

But I can't find the way to do this for wordnet:

    try:
        nltk.data.find('wordnet')  # ????/wordnet
    except LookupError:
        nltk.download('wordnet')

How can I make this check?

Answer 1: You can do:

    nltk.data.find('corpora/wordnet')

Source: https://stackoverflow.com/questions/57925041/how-to-check-if-wordnet-is-already-installed
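
Putting the answer together with the try/except pattern from the question, a complete check looks like this (the key detail is that WordNet is filed under the 'corpora' category):

    import nltk

    try:
        nltk.data.find('corpora/wordnet')
    except LookupError:
        nltk.download('wordnet')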

Is there a way to reverse stem in python nltk?

Submitted by 你说的曾经没有我的故事 on 2021-01-28 07:52:08
Question: I have a list of stems in NLTK/Python and want to get the possible words that each stem comes from. Is there a way to take a stem and get a list of words that stem to it in Python?

Answer 1: To the best of my knowledge the answer is no. Depending on the stemmer, it can be difficult to build an exhaustive search that reverses the stemming rules, and the results would mostly be invalid words by any standard. For example, with the Porter stemmer:

    from nltk.stem.porter import *
    stemmer = PorterStemmer()
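
The answer is cut off above, but since stemming rules cannot be reliably un-applied, a common workaround (a sketch, not from the truncated answer) is to invert the mapping over a vocabulary you already have: stem every word once and index words by their stem. The 'words' corpus here is just an illustrative vocabulary:

    from collections import defaultdict
    from nltk.corpus import words  # plain English word list; needs nltk.download('words')
    from nltk.stem.porter import PorterStemmer

    stemmer = PorterStemmer()

    # Build stem -> {words that produce it} once; lookups are then O(1).
    stem_index = defaultdict(set)
    for w in words.words():
        stem_index[stemmer.stem(w)].add(w)

    print(stem_index[stemmer.stem('running')])  # words sharing the stem of 'running'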