All I want to do is find the sentiment (positive/negative/neutral) of any given string. While researching, I came across Stanford NLP, but sadly it's in Java. Any ideas on how ca
I am facing the same problem: one possible solution is stanford_corenlp_py, which uses Py4j, as pointed out by @roopalgarg.
stanford_corenlp_py
This repo provides a Python interface for calling the "sentiment" and "entitymentions" annotators of Stanford's CoreNLP Java package, current as of v. 3.5.1. It uses py4j to interact with the JVM; as such, in order to run a script like scripts/runGateway.py, you must first compile and run the Java classes creating the JVM gateway.
stanford-corenlp is a really good Python wrapper on top of Stanford CoreNLP.
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
# Simple usage
from stanfordcorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('/Users/name/stanford-corenlp-full-2018-10-05')
sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))
print('Dependency Parsing:', nlp.dependency_parse(sentence))
nlp.close()  # Do not forget to close! The backend server will consume a lot of memory.
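The example above does not request sentiment, which is what the question asks for. Below is a minimal sketch of how that might look with the same wrapper. It assumes the wrapper's annotate method returns the CoreNLP server's reply as JSON text when 'outputFormat' is 'json'; the helper name extract_sentiments is hypothetical, and the sample string is a hand-written stand-in for a real reply.

```python
import json


def extract_sentiments(raw_json):
    """Return (sentence index, sentiment label) pairs from CoreNLP JSON text."""
    doc = json.loads(raw_json)
    return [(s["index"], s["sentiment"]) for s in doc["sentences"]]


# Assumed usage with the wrapper shown above (needs a local CoreNLP install):
# props = {'annotators': 'sentiment', 'pipelineLanguage': 'en', 'outputFormat': 'json'}
# print(extract_sentiments(nlp.annotate('I love you. I hate him.', properties=props)))

# Hand-written stand-in for a server reply, so the helper can be tried offline.
sample = ('{"sentences": [{"index": 0, "sentiment": "Positive"}, '
          '{"index": 1, "sentiment": "Negative"}]}')
print(extract_sentiments(sample))
```

Keeping the JSON handling in a small separate function means it can be checked without starting the Java backend.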
I would suggest using the TextBlob library. A sample implementation goes like this:
from textblob import TextBlob
def sentiment(message):
    # create a TextBlob object from the passed message text
    analysis = TextBlob(message)
    # return the polarity score, a float between -1.0 and 1.0
    return analysis.sentiment.polarity
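The function above returns a polarity score rather than the positive/negative/neutral label the question asks for. Here is a minimal sketch of one way to map the score to a label. The helper names and the 0.1 threshold are assumptions chosen for illustration, not part of TextBlob; tune the threshold for your data.

```python
def polarity_to_label(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    The threshold is an arbitrary illustrative cut-off, not a TextBlob default.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def sentiment_label(message, threshold=0.1):
    """Score a message with TextBlob and return its label (needs textblob installed)."""
    from textblob import TextBlob  # imported lazily so the mapping works without it
    return polarity_to_label(TextBlob(message).sentiment.polarity, threshold)


print(polarity_to_label(0.5), polarity_to_label(-0.5), polarity_to_label(0.0))
```

Scores close to zero are treated as neutral, so a wider threshold yields more neutral results.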
There is very recent progress on this issue: you can now use the stanfordnlp package from within Python.
From the README:
>>> import stanfordnlp
>>> stanfordnlp.download('en') # This downloads the English models for the neural pipeline
>>> nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English
>>> doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
>>> doc.sentences[0].print_dependencies()
TextBlob is a great package for sentiment analysis, written in Python. Its docs cover the details. Sentiment analysis of any given sentence is carried out by inspecting words and their corresponding emotional score (sentiment). You can start with
$ pip install -U textblob
$ python -m textblob.download_corpora
The first command, pip install, installs the latest version of textblob into your system (or virtualenv); the -U flag upgrades the package to its latest available version. The second command downloads all the required data, the corpus.
The latest version at this time (2020-05-25) is 4.0.0:
wget https://nlp.stanford.edu/software/stanford-corenlp-4.0.0.zip https://nlp.stanford.edu/software/stanford-corenlp-4.0.0-models-english.jar
If you do not have wget, you probably have curl:
curl https://nlp.stanford.edu/software/stanford-corenlp-4.0.0.zip -O https://nlp.stanford.edu/software/stanford-corenlp-4.0.0-models-english.jar -O
If all else fails, use the browser ;-)
unzip stanford-corenlp-4.0.0.zip
mv stanford-corenlp-4.0.0-models-english.jar stanford-corenlp-4.0.0
cd stanford-corenlp-4.0.0
java -mx5g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000
Notes:
timeout is in milliseconds; I set it to 10 sec above. You should increase it if you pass huge blobs to the server.
There are more options; you can list them with --help.
-mx5g should allocate enough memory, but YMMV and you may need to modify the option if your box is underpowered.
Install the Python client:
pip install pycorenlp
(See also the official list).
from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate("I love you. I hate him. You are nice. He is dumb",
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
# print one line per sentence: index, tokens, numeric value, label
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))
and you will get:
0: 'I love you .': 3 Positive
1: 'I hate him .': 1 Negative
2: 'You are nice .': 3 Positive
3: 'He is dumb': 1 Negative
Notes:
sentimentValue across sentences (for example, its mean) can be used to estimate the sentiment of the whole text.
The scale runs from VeryNegative (0) through Negative (1), Neutral (2) and Positive (3) to VeryPositive (4); the two extremes appear to be quite rare.
You can stop the server with kill $(lsof -ti tcp:9000). 9000 is the default port; you can change it using the -port option when starting the server.
Increase timeout (in milliseconds) in the server or client if you get timeout errors.
sentiment is just one annotator; there are many more, and you can request several, separating them with commas: 'annotators': 'sentiment,lemma'.
PS. I cannot believe that I added a 9th answer, but, I guess, I had to, since none of the existing answers helped me (some of the 8 previous answers have now been deleted, some others have been converted to comments).
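To make the point about sentimentValue concrete, here is a minimal sketch that averages the per-sentence values of a response shaped like the JSON printed above. The function name text_sentiment and the choice to round the mean to the nearest label are assumptions for illustration; the sample dict is hand-written and only mimics the fields the code reads.

```python
# Labels indexed by sentimentValue, as seen in the output above (0 through 4).
LABELS = ["VeryNegative", "Negative", "Neutral", "Positive", "VeryPositive"]


def text_sentiment(res):
    """Return (mean sentimentValue, nearest label) for a CoreNLP JSON response."""
    # sentimentValue is converted with int() in case it arrives as a string.
    values = [int(s["sentimentValue"]) for s in res["sentences"]]
    if not values:
        raise ValueError("response contains no sentences")
    avg = sum(values) / len(values)
    # Rounding to the nearest integer label is one simple choice among many.
    return avg, LABELS[round(avg)]


# Hand-written stand-in with the same values as the four sentences above.
sample = {"sentences": [{"sentimentValue": "3"}, {"sentimentValue": "1"},
                        {"sentimentValue": "3"}, {"sentimentValue": "1"}]}
print(text_sentiment(sample))
```

With a running server, the res object returned by nlp.annotate in the snippet above could be passed in place of sample. A plain mean weights every sentence equally, so long texts with a few strongly worded sentences will tend toward Neutral.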