sentiment-analysis

How to edit NLTK's VADER sentiment lexicon without modifying a .txt file

旧街凉风 submitted on 2019-12-10 11:58:45
Question: I know you can add your own words by manually adding them to the vader_lexicon.txt file. I was wondering whether there is another way to do it in the Python code, as I don't want people who use my code to have to modify other .txt files. from nltk.sentiment.vader import SentimentIntensityAnalyzer as SIA sia = SIA() sia.lexicon This gets the dict. I was thinking of something like: sia.lexicon.update{u'word': 3} Answer 1: For anyone else: from nltk.sentiment.vader import

Testing the Keras sentiment classification with model.predict

百般思念 submitted on 2019-12-10 11:09:11
Question: I have trained imdb_lstm.py on my PC. Now I want to test the trained network by inputting some text of my own. How do I do it? Thank you! Answer 1: So what you basically need to do is as follows: Tokenize sequences: convert the string into words (features). For example: "hello my name is georgio" becomes ["hello", "my", "name", "is", "georgio"]. Next, you want to remove stop words (check Google for what stop words are). This stage is optional; it may lead to faulty results, but I think it is worth a try. Stem your words (features); that way you'll reduce the number of features, which will lead to a faster run.
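The first two steps of the answer can be sketched in plain Python. The stop-word set below is a toy stand-in (an assumption, not a real list); in practice you would use something like NLTK's stopwords corpus:

```python
STOP_WORDS = {'my', 'is', 'the', 'a'}  # toy list; use a real stop-word corpus in practice

def tokenize(text):
    # step 1: split the string into word features
    return text.lower().split()

def remove_stop_words(tokens):
    # step 2 (optional): drop words that carry little sentiment signal
    return [t for t in tokens if t not in STOP_WORDS]

tokens = tokenize("hello my name is georgio")
print(tokens)                     # ['hello', 'my', 'name', 'is', 'georgio']
print(remove_stop_words(tokens))  # ['hello', 'name', 'georgio']
```

Before calling model.predict, the remaining words still have to be mapped to the same integer indices the network saw during training and padded to the training sequence length; feeding raw strings to the model will not work.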

Python: Loaded NLTK Classifier not working

我的未来我决定 submitted on 2019-12-10 07:36:02
Question: I'm trying to train an NLTK classifier for sentiment analysis and then save the classifier using pickle. The freshly trained classifier works fine. However, if I load a saved classifier, it will output either 'positive' or 'negative' for ALL examples. I'm saving the classifier using classifier = nltk.NaiveBayesClassifier.train(training_set) classifier.classify(words_in_tweet) f = open('classifier.pickle', 'wb') pickle.dump(classifier, f) f.close() and loading the classifier using f
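A frequent cause of this symptom is that the featuresets passed to the loaded classifier are built differently from the ones used at training time. A minimal round-trip sketch (toy sentences and a toy `word_features` helper, both assumptions) that keeps the feature extractor identical on both sides:

```python
import pickle

import nltk

def word_features(words):
    # the SAME feature extractor must be used at training time and after loading;
    # a mismatch is a common reason a loaded classifier labels everything one way
    return {w: True for w in words}

training_set = [(word_features(['great', 'fun']), 'positive'),
                (word_features(['awful', 'boring']), 'negative')]
classifier = nltk.NaiveBayesClassifier.train(training_set)

with open('classifier.pickle', 'wb') as f:
    pickle.dump(classifier, f)
with open('classifier.pickle', 'rb') as f:
    loaded = pickle.load(f)

# the reloaded classifier behaves like the fresh one when the features match
print(loaded.classify(word_features(['awful'])))
```

Pickle faithfully restores the trained model itself, so if the loaded classifier degenerates, the first place to look is the code that builds the featureset it is asked to classify.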

How to identify adjectives or adverbs?

蓝咒 submitted on 2019-12-08 09:17:50
Question: I am quite a novice at NLP... Is there any API or way in which I could identify verbs, adjectives, or adverbs in a sentence? I need it for a project. Answer 1: You will need a part-of-speech tagger (POS tagger). This identifies the role of every word in the sentence. Wikipedia has an excellent list of NLP toolkits, and almost all of them will have POS taggers. If your material is normal written English, the POS taggers will do well. If it's very colloquial (e.g. text messages) or very unusual (e.g.
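Once a tagger has labelled the words, picking out adjectives and adverbs is a simple filter over the tags. The tagged pairs below are hard-coded sample data (an assumption) shaped like what `nltk.pos_tag(nltk.word_tokenize(sentence))` returns, which needs the 'punkt' and 'averaged_perceptron_tagger' resources downloaded once:

```python
# sample (word, tag) pairs as produced by a POS tagger such as nltk.pos_tag
tagged = [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'),
          ('fox', 'NN'), ('quietly', 'RB'), ('jumps', 'VBZ')]

def adjectives_and_adverbs(tagged_pairs):
    # Penn Treebank convention: JJ* marks adjectives, RB* marks adverbs
    return [w for w, tag in tagged_pairs
            if tag.startswith('JJ') or tag.startswith('RB')]

print(adjectives_and_adverbs(tagged))  # ['quick', 'brown', 'quietly']
```

The same prefix test extended to 'VB' would collect the verbs.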

Extract Noun phrase using stanford NLP

一个人想着一个人 submitted on 2019-12-08 04:50:59
Question: I am trying to find the theme/noun phrase in a sentence using Stanford NLP. For example, for the sentence "the white tiger" I would like to get the theme/noun phrase "white tiger". For this I used the POS tagger. My sample code is below. The result I am getting is "tiger", which is not correct. The sample code I ran is public static void main(String[] args) throws IOException { Properties props = new Properties(); props.setProperty("annotators", "tokenize,ssplit,parse"); StanfordCoreNLP pipeline = new
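If the full Stanford pipeline is heavier than needed, the same NP extraction can be sketched with NLTK's RegexpParser instead (a swapped-in technique, not the Stanford parser from the question; the chunk grammar is a toy assumption):

```python
import nltk

# toy chunk grammar: an NP is an optional determiner,
# any number of adjectives, then one or more nouns
parser = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
tagged = [('the', 'DT'), ('white', 'JJ'), ('tiger', 'NN')]

phrases = []
for subtree in parser.parse(tagged).subtrees(filter=lambda t: t.label() == 'NP'):
    # drop the determiner so "the white tiger" yields the theme "white tiger"
    phrases.append(' '.join(w for w, tag in subtree.leaves() if tag != 'DT'))
print(phrases)  # ['white tiger']
```

The key difference from tagging alone is the chunking step: a bare POS tagger can only point at the single head noun ("tiger"), while a chunker groups the adjective with it.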

How is polarity calculated for a sentence? (in sentiment analysis)

。_饼干妹妹 submitted on 2019-12-08 02:05:50
Question: How is the polarity of each word in a statement calculated? For example, in "i am successful in accomplishing the task, but in vain", how is each word scored (e.g. successful: 0.7, accomplishing: 0.8, but: -0.5, vain: -0.8)? How is each word given a value or score? What is going on behind the scenes? As I am doing sentiment analysis, I have a few things to clarify, so it would be great if someone could help. Thanks in advance. Answer 1: If you are willing to use Python and NLTK, then check
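The per-word numbers come from a hand-curated lexicon, and rule layers on top adjust them. A toy sketch of that mechanism (the valences are the asker's own examples, not real VADER values, and the clause weights are illustrative assumptions in the spirit of VADER's "but" handling):

```python
# toy lexicon: each word carries a hand-assigned valence
LEXICON = {'successful': 0.7, 'accomplishing': 0.8, 'vain': -0.8}

def polarity(sentence):
    words = sentence.lower().replace(',', ' ').split()
    if 'but' in words:
        i = words.index('but')
        before = sum(LEXICON.get(w, 0.0) for w in words[:i])
        after = sum(LEXICON.get(w, 0.0) for w in words[i + 1:])
        # a contrast marker shifts weight toward the clause that follows it
        return 0.5 * before + 1.5 * after
    return sum(LEXICON.get(w, 0.0) for w in words)

print(round(polarity("i am successful in accomplishing the task, but in vain"), 2))  # -0.45
```

So even though the positive clause scores higher in isolation (0.7 + 0.8), the contrast rule lets the trailing "in vain" flip the sentence negative, which matches how readers interpret it.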

Identifying the entity in sentiment analysis using Lingpipe

房东的猫 submitted on 2019-12-07 23:56:16
Question: I have implemented sentiment analysis using the sentiment analysis module of LingPipe. I know that they use a dynamic LR model for this. It just tells me whether the test string is a positive or negative sentiment. What ideas could I use to determine the object for which the sentiment has been expressed? If the text is categorized as positive sentiment, I would like to get the object for which the sentiment has been expressed: this could be a movie name, product name, or something else. Answer 1:
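One common idea is to run named-entity recognition alongside the polarity classifier and attach the document-level sentiment to the entities found. A sketch of the pairing step (the chunk tree below is hand-built toy data shaped like NLTK's `ne_chunk` output, not LingPipe's; the names and label are assumptions):

```python
from nltk.tree import Tree

# toy NER output for "Acme Phone was great": one ORGANIZATION entity
chunked = Tree('S', [Tree('ORGANIZATION', [('Acme', 'NNP'), ('Phone', 'NNP')]),
                     ('was', 'VBD'), ('great', 'JJ')])

def entities(tree):
    # collect the words under every named-entity subtree
    return [' '.join(w for w, tag in st.leaves())
            for st in tree.subtrees() if st.label() != 'S']

sentiment = 'positive'  # e.g. the label the LingPipe classifier assigned to the text
print([(e, sentiment) for e in entities(chunked)])  # [('Acme Phone', 'positive')]
```

This only works cleanly when a document discusses one object; texts mentioning several entities need finer-grained (aspect-level) sentiment to decide which entity each opinion targets.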

Estimating document polarity using R's qdap package without sentSplit

不羁的心 submitted on 2019-12-07 11:27:03
Question: I'd like to apply qdap's polarity function to a vector of documents, each of which could contain multiple sentences, and obtain the corresponding polarity for each document. For example: library(qdap) polarity(DATA$state)$all$polarity # Results: [1] -0.8165 -0.4082 0.0000 -0.8944 0.0000 0.0000 0.0000 -0.5774 0.0000 [10] 0.4082 0.0000 Warning message: In polarity(DATA$state) : Some rows contain double punctuation. Suggested use of `sentSplit` function. This warning can't be ignored, as it

Python NLTK sentiment calculation not correct

你离开我真会死。 submitted on 2019-12-06 12:18:58
Question: I have some positive and negative sentences. I want, very simply, to use Python NLTK to train a NaiveBayesClassifier to investigate the sentiment of other sentences. I tried to use this code, but my result is always positive. http://www.sjwhitworth.com/sentiment-analysis-in-python-using-nltk/ I am very new to Python, so there may be a mistake in the code from when I copied it. import nltk import math import re import sys import os import codecs reload(sys) sys.setdefaultencoding('utf-8') from nltk.corpus
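When every prediction comes out 'positive', the usual suspect is the copied feature-extraction code rather than the classifier itself. A minimal self-contained training loop (toy sentences and a bag-of-words extractor, both assumptions) that does produce both labels:

```python
import nltk

def features(sentence):
    # bag-of-words features: every word becomes a boolean feature
    return {word: True for word in sentence.lower().split()}

training_set = [(features("i love this movie"), 'positive'),
                (features("great fun and well acted"), 'positive'),
                (features("i hate this movie"), 'negative'),
                (features("boring and badly acted"), 'negative')]

classifier = nltk.NaiveBayesClassifier.train(training_set)
print(classifier.classify(features("what a boring movie")))
print(classifier.classify(features("what a great movie")))
```

If a setup like this still yields one label for everything, check that the training featuresets are not empty dicts (a broken extractor silently reduces the classifier to the class prior) and that both classes actually appear in training_set.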
