R tm package invalid input in 'utf8towcs'

逝去的感伤 2020-11-29 01:47

I'm trying to use the tm package in R to perform some text analysis. I tried the following:

require(tm)
dataSet <- Corpus(DirSource('tmp/'))
dataSet <         


        
14 answers
  •  栀梦 (original poster)
    2020-11-29 02:19

    I think it is clear by now that the problem is caused by emojis in the text, which tolower is not able to handle.

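    # Remove emojis: sub = '' drops bytes that have no ASCII equivalent
    # (without sub, iconv returns NA for any string it cannot convert)
    dataSet <- tm_map(dataSet, content_transformer(function(x) iconv(x, 'UTF-8', 'ASCII', sub = '')))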
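
To see what the conversion does on its own, here is a minimal base R sketch on a plain character vector rather than a tm corpus. The sample strings and the variable names (texts, cleaned, lowered) are made up for illustration and are not from the question; the exact result may also depend on the platform's iconv implementation.

```r
# Hypothetical sample input: one string with an emoji, one pure ASCII string.
texts <- c("Hello World \U0001F600", "Plain ASCII Text")

# Without `sub`, a string that cannot be converted to ASCII comes back as NA.
strict <- iconv(texts, from = "UTF-8", to = "ASCII")

# With sub = "", the non-convertible bytes are dropped and the rest is kept.
cleaned <- iconv(texts, from = "UTF-8", to = "ASCII", sub = "")

# tolower() can now process every element.
lowered <- tolower(cleaned)
print(lowered)
```

Dropping the bytes loses the emoji entirely. If they need to stay visible, passing sub = "byte" instead replaces each non-convertible byte with its hex code in angle brackets.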
    
