Why do some Unicode characters display in matrices, but not data frames in R?

萌比男神i 2020-11-30 03:22

In at least some cases, Asian characters print correctly when they are in a matrix or a vector, but not when they are in a data.frame. Here

2 answers
  • 2020-11-30 04:09

    I hate to answer my own question, but although the comments and answers helped, they weren't quite right. On Windows, it doesn't seem possible to set a generic 'UTF-8' locale. You can, however, set a country-specific locale, which works in this case:

    Sys.setlocale("LC_CTYPE", locale="Chinese")
    q2 # Works fine
    #  q
    #1 天
    

    But it does make me wonder why exactly format seems to use the locale, and whether there is a way to have it ignore the locale on Windows. I also wonder if there is some generic UTF-8 locale on Windows that I don't know about.
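If changing the locale for the whole session is undesirable, the switch can be scoped to a single call. The following is only a sketch: `with_ctype` is a hypothetical helper (not part of base R), the data frame `q2` is rebuilt here for illustration, and the locale name "Chinese" is assumed to be valid only on Windows. On other platforms the request may fail, in which case the current locale is kept.

```r
# Hypothetical helper: set LC_CTYPE, evaluate an expression, then restore.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  # Restore the previous LC_CTYPE when the function exits, even on error.
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) if the request cannot be honoured.
  ok <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (identical(ok, "")) {
    message("locale '", locale, "' is not available; keeping ", old)
  }
  # expr is a lazy promise, so it is evaluated only now, under the new locale.
  invisible(force(expr))
}

# Illustrative data: U+5929 is the character used in the question.
q2 <- data.frame(q = "\u5929", stringsAsFactors = FALSE)
with_ctype("Chinese", print(q2))
```

Because the expression is evaluated lazily, the printing happens after the locale change and before the restore.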

  • 2020-11-30 04:10

    I blogged about Unicode and R a few days ago. I think your R editor uses UTF-8, which gives you the illusion that R on your Windows machine handles UTF-8 characters.

    The short answer: when you want to process Unicode (here, Chinese), don't use English Windows. Use a Chinese version of Windows, or Linux, which uses UTF-8 by default.
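To check which situation a given session is in, a few base R calls are enough. This is a rough diagnostic sketch, not a fix: the variable names are illustrative, and what `enc2native()` returns depends on the platform and locale, so no output is shown.

```r
# A string written with a \u escape is marked as UTF-8 regardless of locale.
x <- "\u5929"
Encoding(x)

# l10n_info() reports whether the session's locale is a UTF-8 one.
info <- l10n_info()
info[["UTF-8"]]

# Current character-type locale, for comparison with the sessionInfo() output.
Sys.getlocale("LC_CTYPE")

# Conversion to the native encoding; in a locale that cannot represent
# the character, the result may differ from the original string.
native <- enc2native(x)
print(native)
```

If `info[["UTF-8"]]` is `FALSE`, any function that converts strings to the native encoding before printing can lose characters the locale cannot represent, which would be consistent with the behaviour described in the question.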

    Session info in my Ubuntu:

    > sessionInfo()
    R version 2.14.1 (2011-12-22)
    Platform: i686-pc-linux-gnu (32-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    