sentiment-analysis

Google Cloud Natural Language API - How is document magnitude calculated?

牧云@^-^@ submitted on 2019-12-11 18:22:48
Question: I am currently working with the Google Cloud Natural Language API and need to know how the magnitude value for a whole document (consisting of several sentences) is calculated. For the document sentiment score, the average of the scores for each sentence is taken. For the document magnitude, I would have assumed that it is calculated by taking the absolute sum of the individual magnitude values for each sentence, but after testing some paragraphs it is clear that this is not the correct way to…
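A small sketch of the check described above, assuming hypothetical per-sentence (score, magnitude) pairs as they would come back from analyze_sentiment; it compares the averaged score and the summed magnitudes against the document-level values.

# Hypothetical per-sentence results, as returned per sentence by analyze_sentiment()
# (each sentence carries its own sentiment score and magnitude).
sentence_sentiments = [
    (0.8, 0.8),   # (score, magnitude)
    (-0.4, 0.4),
    (0.1, 0.1),
]

scores = [score for score, _ in sentence_sentiments]
magnitudes = [magnitude for _, magnitude in sentence_sentiments]

# Document score: the question states this is the average of the sentence scores.
doc_score_estimate = sum(scores) / len(scores)

# Hypothesis under test: document magnitude = sum of sentence magnitudes
# (magnitudes are already non-negative, so no abs() is needed).
doc_magnitude_estimate = sum(magnitudes)

print(doc_score_estimate, doc_magnitude_estimate)
# Compare these against response.document_sentiment.score / .magnitude.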

Sentiment analysis R syuzhet NRC Word-Emotion Association Lexicon

半城伤御伤魂 submitted on 2019-12-11 16:39:13
Question: How do you find the words associated with the eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) of the NRC Word-Emotion Association Lexicon when using get_nrc_sentiment from the syuzhet package? a <- c("I hate going to work it is dull","I love going to work it is fun") a_corpus = Corpus(VectorSource(a)) a_tm <- TermDocumentMatrix(a_corpus) a_tmx <- as.matrix(a_tm) a_df<-data.frame(text=unlist(sapply(a, `[`)), stringsAsFactors=F) a_sent<-get_nrc…

R count how often words from a list appear in a sentence

旧街凉风 submitted on 2019-12-11 12:16:31
Question: I am currently participating in a MOOC and trying my hand at some sentiment analysis, but I am having trouble with the R code. What I have is a list of bad words and a list of good words. For instance, my bad words are c("dent", "broken", "wear", "cracked"), etc. I have a list of descriptions in my data frame, and what I want to do is get a count of how many of my bad words and how many of my good words appear in each row. For instance, suppose this is my data frame: desc = c("this screen…

NLP - Sentiment Processing for Junk Data takes time

柔情痞子 submitted on 2019-12-11 11:24:15
Question: I am trying to find the sentiment for input text. The text below is a junk sentence, and when I try to find its sentiment, the annotation to parse the sentence takes around 30 seconds; for normal text it takes less than a second. If I need to process millions of records, this will add up. Is there any solution to this? String text = "Nm n n 4 n n bkj nun4hmnun Onn njnb hm5bn nm55m nbbh n mnrrnut but n rym4n nbn 4nn65 m nun m n nn nun 4nm 5 gm n my b bb b b rtmrt55tmmm5tttn b b…

Set of rules for textual analysis - Natural language processing

巧了我就是萌 submitted on 2019-12-11 10:34:06
Question: Does there exist a guide with a set of rules for textual analysis / natural language processing? Do you have a specific package (e.g. in Python) for textual sentiment analysis? Here is the application I am faced with: let's say I have two dictionaries, A and B. A contains "negative" words, and B contains "positive" words. What I can do is count the number of negative and positive words. This creates some issues, such as the following: let's suppose that "exceptionally" is…
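A minimal sketch of the dictionary-counting approach described above, with made-up word lists (not a published lexicon); it simply counts how many tokens fall into each dictionary, which is exactly where the intensifier and negation issues raised in the question appear.

# Made-up example dictionaries; dictionary A holds negative words,
# dictionary B holds positive words.
negative_words = {"dull", "broken", "terrible", "bad"}   # dictionary A
positive_words = {"fun", "great", "good", "excellent"}   # dictionary B

def count_sentiment_words(text):
    """Count how many positive and negative dictionary words appear in the text."""
    tokens = text.lower().split()
    positives = sum(token in positive_words for token in tokens)
    negatives = sum(token in negative_words for token in tokens)
    return positives, negatives

print(count_sentiment_words("The screen is good but the case is broken"))  # (1, 1)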

Is there a way of getting the degree of positiveness or negativeness when using Logistic Regression for sentiment analysis?

你说的曾经没有我的故事 submitted on 2019-12-11 06:17:22
Question: I have been following an example of sentiment analysis using logistic regression, in which the prediction result only gives a 1 or a 0 to indicate positive or negative sentiment, respectively. My challenge is that I want to classify a given user input into one of four classes (very good, good, average, poor), but my prediction result is always 1 or 0. Below is my code sample so far: from sklearn.feature_extraction.text import CountVectorizer from vaderSentiment.vaderSentiment import…
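A minimal sketch with a tiny made-up training set (not the asker's data), showing why predict() can only ever return the hard labels 0 or 1, while predict_proba() exposes a continuous degree of positiveness that could be binned into finer classes.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny made-up training set: 1 = positive, 0 = negative.
texts = [
    "great product, loved it",
    "terrible, broke after a day",
    "works fine and looks good",
    "worst purchase ever",
]
labels = [1, 0, 1, 0]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

sample = vectorizer.transform(["pretty good but not perfect"])
print(clf.predict(sample))        # hard label only: array([0]) or array([1])
print(clf.predict_proba(sample))  # e.g. [[0.38, 0.62]] -- a graded score that can be binned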

What does the SentiWordNet 3.0 result signify?

随声附和 submitted on 2019-12-11 05:39:24
Question: What does the result from SentiWordNet signify? If the value given for "good" is 0.6337, does it mean the probability that the word "good" is positive is 0.6337, or does it mean the word "good" has a weighting of 0.6337? If it is a weighting, then the value for "extraordinary" should be greater than that of "good", but the value given to "extraordinary" is only 0.272727. The format of SentiWordNet is: POS ID PosScore NegScore SynsetTerms Gloss. How exactly is the final result calculated? (using the demo code http:…
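For reference, a small sketch using NLTK's SentiWordNet interface; the scores are attached to synsets (word senses), not to words, so "good" has several entries, each with its own PosScore/NegScore/ObjScore triple summing to 1.

# Requires: nltk.download('wordnet'); nltk.download('sentiwordnet')
from nltk.corpus import sentiwordnet as swn

# Inspect every adjective sense of "good"; each sense carries its own scores.
for senti_synset in swn.senti_synsets('good', 'a'):
    print(senti_synset.synset.name(),
          senti_synset.pos_score(),   # PosScore
          senti_synset.neg_score(),   # NegScore
          senti_synset.obj_score())   # ObjScore = 1 - PosScore - NegScore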

No freeze attribute when using Dataset module in Python

╄→尐↘猪︶ㄣ submitted on 2019-12-11 05:00:56
Question: I am currently trying to implement a form of Twitter data analysis. I already have code up and running to pull data using the Streaming API, and all I have to do is save the data in a CSV file. result = db[settings.TABLE_NAME].all() dataset.freeze(result, format='csv', filename=settings.CSV_NAME) From what I saw in the documentation, this should be the right way of declaring this. I have defined TABLE_NAME and CSV_NAME in another file, settings.py. When I run python dump.py, it gives me…
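As a point of comparison, a minimal sketch that exports the same kind of query result using only the standard-library csv module (the database URL, table name, and output path here are assumptions, not the asker's settings); the dataset library's connect/table API returns each row as a dict, which DictWriter can consume directly.

import csv
import dataset

db = dataset.connect("sqlite:///tweets.db")   # hypothetical database URL
rows = list(db["tweets"].all())               # hypothetical table name; rows are dict-like

if rows:
    with open("tweets.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)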

Issue with Spark MLLib that causes probability and prediction to be the same for everything

天大地大妈咪最大 submitted on 2019-12-11 02:29:47
Question: I am learning how to use machine learning with Spark MLlib, with the purpose of doing sentiment analysis of tweets. I got a sentiment analysis dataset from here: http://thinknook.com/wp-content/uploads/2012/09/Sentiment-Analysis-Dataset.zip That dataset contains 1 million tweets classified as Positive or Negative. The second column of this dataset contains the sentiment and the fourth column contains the tweet. This is my current PySpark code: import csv from pyspark.sql import Row from…
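A minimal PySpark sketch of the described task, assuming the CSV header column names Sentiment and SentimentText (the second and fourth columns mentioned above; adjust if the header differs), a placeholder file path, and a local Spark session; this is a generic tokenize / hash / logistic-regression pipeline, not the asker's exact code.

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("tweet-sentiment").getOrCreate()

# Read the dataset; the file path and column names are assumptions.
raw = spark.read.csv("Sentiment Analysis Dataset.csv", header=True)
data = raw.selectExpr("cast(Sentiment as double) as label",
                      "SentimentText as text").na.drop()

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="words"),
    HashingTF(inputCol="words", outputCol="features"),
    LogisticRegression(maxIter=10),
])

train, test = data.randomSplit([0.8, 0.2], seed=42)
model = pipeline.fit(train)

# Inspect per-row probability and prediction to check that they actually vary.
model.transform(test).select("text", "probability", "prediction").show(5, truncate=60)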

R sentiment analysis; 'lexicon' not found; 'sentiments' corrupted?

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-10 18:50:53
Question: I am trying to follow this online tutorial on sentiment analysis. The code: new_sentiments <- sentiments %>% #From the tidytext package filter(lexicon != "loughran") %>% #Remove the finance lexicon mutate( sentiment = ifelse(lexicon == "AFINN" & score >= 0, "positive", ifelse(lexicon == "AFINN" & score < 0, "negative", sentiment))) %>% group_by(lexicon) %>% mutate(words_in_lexicon = n_distinct(word)) %>% ungroup() generates the error: Error in filter_impl(.data, quo) : Evaluation error:…