I'm trying to use the tm package in R to perform some text analysis. I tried the following:
require(tm)
dataSet <- Corpus(DirSource('tmp/'))
dataSet <
This is from the tm FAQ (http://tm.r-forge.r-project.org/faq.html):

tm_map(yourCorpus, function(x) iconv(enc2utf8(x), sub = "byte"))

It will replace non-convertible bytes in yourCorpus with strings showing their hex codes.

I hope this helps; it did for me.
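
To see what sub = "byte" does without building a corpus, here is a minimal base R sketch on a single string. The sample string and the explicit from/to arguments are my own assumptions for illustration (the FAQ line relies on the locale defaults instead), so treat this as a rough demonstration rather than what tm does internally.

```r
# Build a string with a stray 0xE9 byte (Latin-1 "e acute"), which is
# not a valid byte sequence when the text is read as UTF-8.
bad <- rawToChar(c(charToRaw("caf"), as.raw(0xE9), charToRaw(" au lait")))

# The input is not valid UTF-8 as it stands.
print(validUTF8(bad))

# Assumption: force a UTF-8 to UTF-8 conversion so the result does not
# depend on the current locale. sub = "byte" replaces each
# non-convertible byte with its hex code in angle brackets.
cleaned <- iconv(bad, from = "UTF-8", to = "UTF-8", sub = "byte")

print(cleaned)
print(validUTF8(cleaned))
```

The same function body can be passed to tm_map as in the FAQ line, so every document in the corpus gets its non-convertible bytes replaced the same way.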