问题
I'd like you to give me some advice in order to tackle this problem. At college I've been solving opinion mining tasks but with Twitter the approach is quite different. For example, I used an ensemble learning approach to classify users opinions about a certain Hotel in Spain. Of course, I was given a training set with positive and negative opinions and then I tested with the test set. But now, with twitter, I've found this kind of categorization very difficult.
Do I need to have a training set? and if the answer to this question is positive, don't you think twitter is so temporal so if I have that set, my performance on future topics will be very poor?
I was thinking in getting a dictionary (mainly adjectives) and cross my tweets with it and obtain a term-document matrix but I have no class assigned to any twitter. Also, positive adjectives and negative adjectives could vary depending on the topic and time. So, how to deal with this?
How to deal with the problem of languages? For instance, I'd like to study tweets written in English and those in Spanish, but separately.
Which programming languages do you suggest to do something like this? I've been trying with R packages like tm, twitteR.
回答1:
- Sure, I think the way sentiment is used will stay constant for a few months. worst case you relabel and retrain. Unsupervised learning has a shitty track record for industrial applications in my experience.
- You'll need some emotion/adj dictionary for sentiment stuff- there are some datasets out there but I forget where they are. I may have answered previous questions with better info.
- Just do English tweets, it's fairly easy to build a language classifier, but you want to start small, so take it easy on yourself
- Python (NLTK) if you want to do it easily in a small amount of code. Java has good NLP stuff, but Python and it's libraries are way more user friendly
回答2:
This site: https://sites.google.com/site/miningtwitter/questions/sentiment provides 3 ways to do sentiment analysis using R.
The twitter package is now updated to work with the new twitter API. I'd you download the source version of the package to avoid getting duplicated tweets.
I'm working on a spanish dictionary for opinion mining, and would publish somewhere accesible.
cheers!
回答3:
Sentiment Analysis will give only 3 results as said above - positive, negative and neutral. I found a tutorial on Twitter Sentiment analysis and it's quiet easy.
I found it here - https://www.ai-ml.tech/twitter-sentiment-analysis/
Only 3 dependencies, i downloaded and lesser code, done. Just go through it, you will get the solution.
来源:https://stackoverflow.com/questions/10416343/how-to-tackle-twitter-sentiment-analysis