Stanford nlp for python


All I want to do is find the sentiment (positive/negative/neutral) of any given string. While researching, I came across Stanford NLP, but sadly it's in Java. Any ideas on how ca

Use the stanford-corenlp Python library

stanford-corenlp is a really good Python wrapper around Stanford CoreNLP. Download the CoreNLP distribution first:

```
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
```

Usage

```
# Simple usage
from stanfordcorenlp import StanfordCoreNLP

# Path to the unzipped CoreNLP directory
nlp = StanfordCoreNLP('/Users/name/stanford-corenlp-full-2018-10-05')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))
print('Dependency Parsing:', nlp.dependency_parse(sentence))

nlp.close()  # Do not forget to close! The backend server consumes a lot of memory.
```



伪装坚强ぢ

2020-11-29 18:14

I would suggest using the TextBlob library. A sample implementation looks like this:

```
from textblob import TextBlob

def sentiment(message):
    # Create a TextBlob object from the passed text
    analysis = TextBlob(message)
    # Return the polarity score, a float in [-1.0, 1.0]
    return analysis.sentiment.polarity
```
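
Since the question asks for a positive/negative/neutral label rather than a number, the polarity score could be mapped to a label along these lines. This is only a sketch: the helper name `label_from_polarity` and the 0.1 cutoff are assumptions made for illustration, not part of TextBlob.

```python
def label_from_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label.

    The threshold is an arbitrary illustrative cutoff (an assumption),
    not a value recommended by TextBlob.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Hypothetical usage with the sentiment() function from this answer:
#   label_from_polarity(sentiment("some text"))
print(label_from_polarity(0.5))   # positive
print(label_from_polarity(-0.5))  # negative
print(label_from_polarity(0.0))   # neutral
```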


情话喂你

2020-11-29 18:14

There is recent progress on this issue: you can now use the stanfordnlp package directly from Python.

From the README:

```
>>> import stanfordnlp
>>> stanfordnlp.download('en')   # This downloads the English models for the neural pipeline
>>> nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English
>>> doc = nlp("Barack Obama was born in Hawaii.  He was elected president in 2008.")
>>> doc.sentences[0].print_dependencies()
```


灰色年华

2020-11-29 18:23

TextBlob is a great Python package for sentiment analysis; see its documentation for details. Sentiment analysis of a given sentence is carried out by inspecting its words and their corresponding emotional scores (sentiment). You can start with:
```
$ pip install -U textblob
$ python -m textblob.download_corpora
```
The first command installs the latest version of textblob in your (virtualenv) system; the -U flag upgrades the package to its latest available version. The second command downloads all the required data (the corpora).
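
To make the idea of per-word scores concrete, here is a toy sketch of lexicon-based scoring. The four-word lexicon and the `toy_polarity` helper are made up for illustration; this is not TextBlob's implementation, whose lexicon and scoring are considerably richer.

```python
import re

# Hypothetical miniature lexicon: word -> polarity in [-1.0, 1.0]
TOY_LEXICON = {"love": 0.5, "nice": 0.6, "hate": -0.8, "dumb": -0.4}


def toy_polarity(text):
    """Average the polarity of the words found in the lexicon (0.0 if none)."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


print(toy_polarity("I love you"))   # 0.5
print(toy_polarity("I hate him"))   # -0.8
print(toy_polarity("The sky"))      # 0.0 (no known words)
```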

予麋鹿

2020-11-29 18:26

Use py-corenlp

Download Stanford CoreNLP

The latest version at this time (2020-05-25) is 4.0.0:
```
wget https://nlp.stanford.edu/software/stanford-corenlp-4.0.0.zip https://nlp.stanford.edu/software/stanford-corenlp-4.0.0-models-english.jar
```
If you do not have wget, you probably have curl:
```
curl https://nlp.stanford.edu/software/stanford-corenlp-4.0.0.zip -O https://nlp.stanford.edu/software/stanford-corenlp-4.0.0-models-english.jar -O
```
If all else fails, use the browser ;-)

Install the package
```
unzip stanford-corenlp-4.0.0.zip
mv stanford-corenlp-4.0.0-models-english.jar stanford-corenlp-4.0.0
```
Start the server
```
cd stanford-corenlp-4.0.0
java -mx5g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000
```
Notes:
1. timeout is in milliseconds; I set it to 10 seconds above. Increase it if you pass huge blobs to the server.
2. There are more options; you can list them with --help.
3. -mx5g should allocate enough memory, but YMMV, and you may need to modify the option if your box is underpowered.
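
Before sending requests, it can help to confirm that something is listening on the server port. Below is a small sketch using only the standard library; `port_is_open` is a hypothetical helper name, and 9000 is assumed because it is the server's default port. Note that it only checks that the port accepts TCP connections, not that the process behind it is CoreNLP.

```python
import socket


def port_is_open(host="localhost", port=9000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts
        return False


if __name__ == "__main__":
    print("server reachable:", port_is_open())
```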
Install the python package
```
pip install pycorenlp
```
(See also the official list).

Use it
```
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate("I love you. I hate him. You are nice. He is dumb",
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))
```
and you will get:
```
0: 'I love you .': 3 Positive
1: 'I hate him .': 1 Negative
2: 'You are nice .': 3 Positive
3: 'He is dumb': 1 Negative
```
Notes
1. You pass the whole text to the server, and it splits it into sentences and the sentences into tokens.
2. The sentiment is ascribed to each sentence, not the whole text. The mean sentimentValue across sentences can be used to estimate the sentiment of the whole text.
3. The average sentiment of a sentence is between Neutral (2) and Negative (1). The range is from VeryNegative (0) to VeryPositive (4), both of which appear to be quite rare.
4. You can stop the server either by typing Ctrl-C in the terminal you started it from or by running the shell command kill $(lsof -ti tcp:9000). 9000 is the default port; you can change it with the -port option when starting the server.
5. Increase timeout (in milliseconds) in the server or the client if you get timeout errors.
6. sentiment is just one annotator; there are many more, and you can request several, separated by commas: 'annotators': 'sentiment,lemma'.
7. Beware that the sentiment model is somewhat idiosyncratic (e.g., the result is different depending on whether you mention David or Bill).
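
The averaging suggested in note 2 might be sketched as follows. The helper name `overall_sentiment` and the round-half-up mapping from the mean back to a label are assumptions for illustration; the sample dictionary only mimics the fields of the JSON response used above, with the values from the output shown earlier.

```python
# Labels indexed by sentimentValue, per the 0..4 range described in note 3
LABELS = ["VeryNegative", "Negative", "Neutral", "Positive", "VeryPositive"]


def overall_sentiment(res):
    """Return (mean sentimentValue, nearest label) for a parsed JSON response."""
    values = [int(s["sentimentValue"]) for s in res["sentences"]]
    if not values:
        raise ValueError("response contains no sentences")
    score = sum(values) / len(values)
    # Round half up to pick the nearest label (an illustrative choice)
    return score, LABELS[int(score + 0.5)]


# Stand-in for the server response, using the values printed above
sample = {"sentences": [{"sentimentValue": "3"}, {"sentimentValue": "1"},
                        {"sentimentValue": "3"}, {"sentimentValue": "1"}]}
print(overall_sentiment(sample))  # (2.0, 'Neutral')
```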
PS. I cannot believe that I added a 9th answer, but, I guess, I had to, since none of the existing answers helped me (some of the 8 previous answers have now been deleted, some others have been converted to comments).