I need to do sentiment analysis on some CSV files containing tweets. I'm using SentiWordNet to do the sentiment analysis.
I got the following piece of sample java c
First of all, start by deleting all the "garbage" at the beginning of the file (the description, instructions, etc.).
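If you'd rather not edit the file by hand, the same cleanup can be done while reading it. This is only a minimal sketch: it assumes the description and instruction lines in your copy of the SentiWordNet file all start with `#` (check your file), and `SwnCleaner` and `dataLines` are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class SwnCleaner {
    // Keeps only data lines: drops blank lines and lines starting with '#'.
    // Assumption: every description/instruction line begins with '#'.
    static List<String> dataLines(List<String> rawLines) {
        List<String> kept = new ArrayList<>();
        for (String line : rawLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#"))
                continue;
            kept.add(line);
        }
        return kept;
    }
}
```

You could feed it the result of `java.nio.file.Files.readAllLines(...)` and hand the returned list to whatever builds the dictionary.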
One possible approach is to change SWN3 and make its extract method return a Double:
public Double extract(String word)
{
    // Sum the word's scores over all four parts of speech:
    // noun (#n), adjective (#a), adverb (#r) and verb (#v).
    double total = 0.0;
    for (String pos : new String[] {"n", "a", "r", "v"}) {
        Double score = _dict.get(word + "#" + pos);
        if (score != null)
            total += score;
    }
    return total;
}
Then, given a String that you want to tag, you can split it into words (stripping punctuation and unknown characters) and use the scores returned by the extract method for each word to decide the overall weight of the String:
String[] words = twit.split("\\s+");
double totalScore = 0, averageScore;
for (String word : words) {
    // Strip everything that is not a letter before the lookup.
    word = word.replaceAll("([^a-zA-Z\\s])", "");
    Double score = _sw.extract(word);
    if (score == null)
        continue;
    totalScore += score;
}
// Despite the name, this is the summed score; it is not divided by the word count.
averageScore = totalScore;

if (averageScore >= 0.75)
    return "very positive";
else if (averageScore > 0.25)
    return "positive";
else if (averageScore <= -0.75)
    return "very negative";
else if (averageScore < 0)
    // Everything in (-0.75, 0) counts as negative.
    return "negative";
return "neutral";
I found this way easier and it works fine for me.
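To try the idea without the real SentiWordNet file, here is a self-contained sketch of the two steps (per-word lookup, then thresholding). The class name `TweetSentiment`, the `classify` method and the scores in `main` are all made up for illustration; they are not real SentiWordNet values. Scores between -0.75 and -0.5 are treated as negative here.

```java
import java.util.HashMap;
import java.util.Map;

class TweetSentiment {
    private final Map<String, Double> dict;

    TweetSentiment(Map<String, Double> dict) {
        this.dict = dict;
    }

    // Sum of the word's scores over its noun, adjective, adverb and verb entries.
    double extract(String word) {
        double total = 0.0;
        for (String pos : new String[] {"n", "a", "r", "v"}) {
            Double score = dict.get(word + "#" + pos);
            if (score != null)
                total += score;
        }
        return total;
    }

    // Sums the word scores of the tweet and maps the sum to a label.
    String classify(String tweet) {
        double totalScore = 0;
        for (String word : tweet.split("\\s+")) {
            word = word.replaceAll("[^a-zA-Z]", "");
            totalScore += extract(word);
        }
        if (totalScore >= 0.75) return "very positive";
        if (totalScore > 0.25) return "positive";
        if (totalScore <= -0.75) return "very negative";
        if (totalScore < 0) return "negative";
        return "neutral";
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        Map<String, Double> dict = new HashMap<>();
        dict.put("good#a", 0.6);
        dict.put("good#n", 0.3);
        dict.put("awful#a", -0.4);
        TweetSentiment ts = new TweetSentiment(dict);
        System.out.println(ts.classify("Such a good day!"));  // very positive
        System.out.println(ts.classify("awful traffic..."));  // negative
        System.out.println(ts.classify("on my way"));         // neutral
    }
}
```

Note that the lookup is case-sensitive, so "Good" would not match a "good#a" entry unless you lowercase the words first.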
UPDATE:
I changed _dict to _dict = new HashMap<String, Double>(); so it has a String key and a Double value.
I then replaced _dict.put(word, sent); with _dict.put(word, score);
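For context, here is a rough sketch of how a String-to-Double dictionary like this could be filled. It rests on several assumptions: the tab-separated SentiWordNet 3.0 layout (POS, ID, PosScore, NegScore, SynsetTerms, Gloss), header lines already removed, sense numbers starting at 1, and each sense weighted by 1/rank (one possible choice; adjust it to whatever your SWN3 does). `SwnDictionary` and `build` are names invented for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SwnDictionary {
    // Builds a map from "word#pos" to a single score
    // (rank-weighted average of PosScore - NegScore over the word's senses).
    static Map<String, Double> build(List<String> dataLines) {
        // word#pos -> (sense rank -> PosScore - NegScore)
        Map<String, Map<Integer, Double>> senses = new HashMap<>();
        for (String line : dataLines) {
            String[] cols = line.split("\t");
            if (cols.length < 5)
                continue; // not a data line
            double synsetScore = Double.parseDouble(cols[2]) - Double.parseDouble(cols[3]);
            for (String term : cols[4].split(" ")) {
                int hash = term.lastIndexOf('#');
                if (hash < 0)
                    continue; // no sense number, skip
                String key = term.substring(0, hash) + "#" + cols[0];
                int rank = Integer.parseInt(term.substring(hash + 1));
                senses.computeIfAbsent(key, k -> new HashMap<>()).put(rank, synsetScore);
            }
        }
        Map<String, Double> dict = new HashMap<>();
        for (Map.Entry<String, Map<Integer, Double>> e : senses.entrySet()) {
            double weighted = 0.0, weights = 0.0;
            for (Map.Entry<Integer, Double> sense : e.getValue().entrySet()) {
                weighted += sense.getValue() / sense.getKey();
                weights += 1.0 / sense.getKey();
            }
            dict.put(e.getKey(), weighted / weights);
        }
        return dict;
    }
}
```

The keys come out as word#pos (for example good#a), which is the form the extract method looks up.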
For that, you should write a main function that takes the path of the CSV file, extracts the words from it, and then calls the extract function with each word and its part of speech (POS).
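A rough sketch of such a main function follows. Everything in it is an assumption for illustration: the class name `CsvSentiment`, the two-argument `extract`, the placeholder dictionary with made-up scores, and the idea that the tweet text sits in a known column. Since there is no POS tagger here, each word is simply tried with all four POS tags. The CSV handling is a plain split on commas, so it will break on quoted fields that contain commas; a proper CSV library is the safer choice for real data.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CsvSentiment {
    // Looks up one word with an explicit part of speech ("n", "a", "r" or "v").
    static Double extract(Map<String, Double> dict, String word, String pos) {
        return dict.get(word + "#" + pos);
    }

    // Strips non-letters, lowercases each word, and sums its scores
    // over all four parts of speech.
    static double scoreText(Map<String, Double> dict, String text) {
        double total = 0.0;
        for (String word : text.split("\\s+")) {
            word = word.replaceAll("[^a-zA-Z]", "").toLowerCase();
            for (String pos : new String[] {"n", "a", "r", "v"}) {
                Double score = extract(dict, word, pos);
                if (score != null)
                    total += score;
            }
        }
        return total;
    }

    // Naive CSV handling: quoted fields containing commas are NOT supported.
    static String tweetColumn(String csvLine, int column) {
        String[] cells = csvLine.split(",", -1);
        return column < cells.length ? cells[column] : "";
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: CsvSentiment <tweets.csv> <text column index>");
            return;
        }
        int column = Integer.parseInt(args[1]);
        // Placeholder dictionary with made-up scores; load the real one instead.
        Map<String, Double> dict = new HashMap<>();
        dict.put("good#a", 0.6);
        dict.put("bad#a", -0.6);
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String line : lines) {
            String tweet = tweetColumn(line, column);
            System.out.println(scoreText(dict, tweet) + "\t" + tweet);
        }
    }
}
```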