I'm trying to do some analysis of Twitter data. I downloaded the tweets and created a corpus from the tweet text using the code below:
# Creating a Corpus
wim_co
I think the error is caused by some "exotic" characters within the tweet messages that the tm functions cannot handle. I've got the same error when using tweets as a corpus source. Maybe the following workaround helps:
# Reading some tweet messages (here from a text file) into a vector
rawTweets <- readLines(con = "target_7_sample.txt", ok = TRUE, warn = FALSE, encoding = "UTF-8")
# Convert the tweet text explicitly into UTF-8
convTweets <- iconv(rawTweets, to = "UTF-8")
# The conversion above leaves NA entries in the vector for those tweets that can't be converted. Remove the NA entries with the following command:
tweets <- convTweets[!is.na(convTweets)]
If the loss of some tweets is not an issue for your use case (e.g. building a word cloud), this approach may work, and you can proceed by calling the Corpus function of the tm package.
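Putting the steps together, a minimal sketch could look like this (the file name is taken from the snippet above; the corpus variable name is illustrative):

```r
library(tm)

# Read the raw tweet messages from a text file
rawTweets <- readLines(con = "target_7_sample.txt", warn = FALSE, encoding = "UTF-8")

# Convert explicitly to UTF-8; unconvertible entries become NA
convTweets <- iconv(rawTweets, to = "UTF-8")

# Drop the NA entries (the tweets that couldn't be converted)
tweets <- convTweets[!is.na(convTweets)]

# Build the corpus from the cleaned character vector
myCorpus <- Corpus(VectorSource(tweets))
```

`VectorSource` treats each element of the vector as one document, so each surviving tweet becomes its own document in the corpus.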
Regards--Albert