I try to open a UTF-8 encoded .csv file that contains (traditional) Chinese characters in R. For some reason, R displays the information sometimes as Chinese characters, som
This is not a bug, but rather a misunderstanding of the underlying type conversions (between the character type and the factor type) that happen when constructing a data.frame.
You could start with data <- read.csv("mydata.csv", encoding="UTF-8", stringsAsFactors=FALSE), which keeps your Chinese characters as the character type instead of converting them to factors, so printing them should show what you expect.
@nograpes: similarly, x <- c('中華民族'); x; y <- data.frame(x, stringsAsFactors=FALSE), and everything should be OK.
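To make the difference concrete, here is a minimal sketch of the two cases. The sample strings and the temporary file are made up for illustration (they stand in for mydata.csv), and stringsAsFactors is passed explicitly in both calls because its default differs between R versions. How the factor version actually prints will depend on your platform and locale, so the sketch only checks the column types.

```r
# Hypothetical sample data standing in for the contents of mydata.csv
x <- c("中華民族", "台北")

# stringsAsFactors = TRUE: the column is converted to a factor
as_factor <- data.frame(x, stringsAsFactors = TRUE)
class(as_factor$x)   # "factor"

# stringsAsFactors = FALSE: the column stays a character vector
as_char <- data.frame(x, stringsAsFactors = FALSE)
class(as_char$x)     # "character"

# Round trip through a UTF-8 encoded CSV (temporary file, illustration only)
tmp <- tempfile(fileext = ".csv")
write.csv(as_char, tmp, row.names = FALSE, fileEncoding = "UTF-8")

# Same call as suggested above, pointed at the temporary file
data <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)
class(data$x)        # "character"
print(data)
```

If class() reports "factor" for a column in your own data, that column went through the conversion described above, and rereading the file with stringsAsFactors=FALSE is the first thing I would try.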