Emoticons in Twitter Sentiment Analysis in R

小鲜肉 2020-12-03 05:35

How do I handle/get rid of emoticons so that I can sort tweets for sentiment analysis?

Getting: Error in sort.list(y) : invalid input

Thanks


2 Answers
  •  一整个雨季
    2020-12-03 05:59

    You can use a regular expression to find the words that contain no letters or digits (such as standalone emoticons) and remove them. Sample code:

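    rmNonAlphabet <- function(str) {
      # Split the tweet into space-separated words
      words <- unlist(strsplit(str, " "))
      # Indices of words that contain at least one letter or digit
      # (inside [] a "|" is a literal pipe, so it is left out of the pattern)
      in.alphabet <- grep("[a-z0-9]", words, ignore.case = TRUE)
      # Rebuild the string from the words that were kept
      paste(words[in.alphabet], collapse = " ")
    }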
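
A word-level filter keeps any word that mixes text with an emoji (for example "great😀"), so the emoji bytes can still reach sort(). The "invalid input" error may come from such multibyte characters, though the question does not confirm this. As a rough sketch under that assumption, the characters themselves can be dropped with base R's iconv(). The helper name strip_non_ascii and the sample tweets are made up for illustration, and iconv() behaviour can vary somewhat between platforms.

```r
# Hypothetical helper (not from the original answer): drop every non-ASCII
# character, emoji included, from a character vector of tweets.
strip_non_ascii <- function(tweets) {
  # sub = "" asks iconv() to delete bytes it cannot convert to ASCII
  cleaned <- iconv(tweets, from = "UTF-8", to = "ASCII", sub = "")
  # Collapse the runs of whitespace left behind and trim both ends
  trimws(gsub("\\s+", " ", cleaned))
}

# Made-up sample data: one tweet with an emoji, one plain, one with an accent
tweets <- c("I love this \U0001F600 so much", "plain text", "caf\u00e9 time")
clean <- strip_non_ascii(tweets)

# With only ASCII left, lower-casing and sorting should no longer hit
# invalid multibyte input
sorted <- sort(tolower(clean))
print(sorted)
```

Note that this also removes accented letters, and it leaves ASCII emoticons such as ":)" in place. Those do not break sorting, and they can be stripped separately if the sentiment lexicon does not use them.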
    
