RMarkdown UTF-8 error on multiple operating systems

Submitted by 爱⌒轻易说出口 on 2019-12-10 13:22:54

Question


We have a problem using RMarkdown on multiple operating systems.

Initially, an .Rmd file is created on a Linux system (Ubuntu 12.04 LTS) and then pushed to a GitHub repo.

It can be compiled ("knitted") without problems on this system.

It is then pulled on a Windows 7 machine with RStudio installed.

There, when trying to compile, the following error shows up:

Error in yaml::yaml.load(front_matter) : 
  Reader error: invalid leading UTF-8 octet: #FC at 66
Calls: <Anonymous> -> parse_yaml_front_matter -> <Anonymous> -> .Call
Execution halted

However:

1. Creating a new .Rmd file on the Windows system and compiling it works flawlessly.
2. Creating a new .Rmd file on the Windows system, copying everything but the first few lines of the "problematic" file into it, and compiling that file also works flawlessly.

I compared both files in HEX (in Sublime) on both operating systems: They are EXACTLY the same.
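The hex comparison can be complemented by checking from within R whether the bytes on disk are valid UTF-8. The sketch below is a minimal illustration, not part of the original post: `find_invalid_utf8` and `demo_path` are hypothetical names, and `validUTF8()` assumes R 3.3.0 or later (newer than the R 3.1.2 used here). If the file on disk really is identical on both systems, a check like this would presumably report no invalid lines, which would point at the reading step rather than the file.

```r
# Hypothetical helper: returns the numbers of the lines whose bytes are not valid UTF-8.
find_invalid_utf8 <- function(path) {
  # "native.enc" disables re-encoding, so the bytes are kept as stored on disk
  con <- file(path, open = "r", encoding = "native.enc")
  on.exit(close(con))
  lines <- readLines(con, warn = FALSE)
  which(!validUTF8(lines))
}

# Demo file (made up for illustration): line 1 contains the single byte FC,
# which is "ü" in Latin-1 / Windows-1252 but not valid UTF-8.
demo_path <- tempfile(fileext = ".Rmd")
writeBin(
  c(charToRaw("title: M"), as.raw(0xfc), charToRaw("ller\nbody text\n")),
  demo_path
)

bad_lines <- find_invalid_utf8(demo_path)
print(bad_lines)
```

Running the helper on the real .Rmd file on both machines would show whether any line is already broken before rmarkdown touches it.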

Has somebody else seen that error before?

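Update: It seems as if a German umlaut ("ü") is causing the problem: its Unicode code point is U+00FC (shown as \uFC in the "Escaped Unicode" field of http://www.endmemo.com/unicode/unicodeconverter.php). In UTF-8 itself, "ü" is encoded as the two bytes C3 BC, so a lone byte FC suggests that the text reaches the YAML parser in a single-byte encoding such as Latin-1 or Windows-1252.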
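To see which bytes "ü" produces in each encoding, here is a minimal base R sketch. The variable names are illustrative only, and the character is written as an escape so the script itself stays ASCII.

```r
# "ü" (U+00FC), forced to UTF-8 so the result does not depend on the session locale
u_utf8 <- enc2utf8("\u00fc")

# Bytes of the UTF-8 form: two bytes
utf8_bytes <- charToRaw(u_utf8)

# Bytes after converting to Latin-1: a single byte
latin1_bytes <- charToRaw(iconv(u_utf8, from = "UTF-8", to = "latin1"))

print(utf8_bytes)    # c3 bc
print(latin1_bytes)  # fc
```

The byte FC in the error message matches the Latin-1 form, not the UTF-8 form, which is consistent with the text having been converted to a single-byte encoding before it was parsed as UTF-8.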

In general, it seems that Unicode is not correctly recognized by either R, RStudio or knitr on Windows. When I type some umlauts in a new .Rmd file and knit it, I get output such as "öää". In RStudio > Tools > Global options, I set the Default text encoding to "UTF-8". I also did that for R, in the Rprofile.site file (options(encoding="UTF-8")).

Update 2: library(rmarkdown); sessionInfo() gives

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252    LC_MONETARY=German_Switzerland.1252
[4] LC_NUMERIC=C                        LC_TIME=German_Switzerland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rmarkdown_0.4.2

loaded via a namespace (and not attached):
[1] digest_0.6.8    htmltools_0.2.6 tools_3.1.2    

on Windows 7, whereas, on Ubuntu, it is:

R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rmarkdown_0.3.10

loaded via a namespace (and not attached):
[1] digest_0.6.8    htmltools_0.2.6 tools_3.1.2   

I already suspect the problem to be the diverging locale... how do I fix this?
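To compare the relevant settings on both machines beyond what sessionInfo() prints, a small base R sketch such as the following may help. The list name `settings` and its element names are made up for illustration.

```r
# Collect the settings that influence how R reads and interprets text files
settings <- list(
  encoding_option = getOption("encoding"),       # default encoding for connections
  ctype_locale    = Sys.getlocale("LC_CTYPE"),   # character-type locale
  utf8_locale     = l10n_info()[["UTF-8"]]       # TRUE if the session runs in a UTF-8 locale
)
str(settings)
```

Given the sessionInfo() output above, `utf8_locale` would presumably be FALSE on the Windows machine (code page 1252) and TRUE on Ubuntu, and `encoding_option` would show "UTF-8" wherever it was set in Rprofile.site.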


Answer 1:


I am extremely late to this, but I solved the issue by changing the encoding option back to "native":

options(encoding="native")

I also changed the default Windows encoding to UTF-8, which opened a Pandora's box of other encoding-related issues in other programs, so treat this step with caution.
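As a hedged illustration of the first step, the sketch below resets the option and reads a UTF-8 file while declaring its encoding per call. It rests on assumptions beyond the answer: it uses "native.enc", which is R's documented default value for this option, rather than the "native" written above, and `demo_path` is a made-up temporary file. It only shows how base R handles the bytes and does not claim to reproduce the rmarkdown fix.

```r
# Assumption: "native.enc" is R's documented default; it means "do not re-encode"
options(encoding = "native.enc")

# Demo file (illustrative): "ü" stored as UTF-8, i.e. the bytes C3 BC
demo_path <- tempfile(fileext = ".Rmd")
writeBin(
  c(charToRaw("title: M"), as.raw(c(0xc3, 0xbc)), charToRaw("ller\n")),
  demo_path
)

# encoding = "UTF-8" here only marks the strings as UTF-8; the bytes are not converted
lines <- readLines(demo_path, encoding = "UTF-8", warn = FALSE)

print(Encoding(lines))       # how R has marked the string
print(charToRaw(lines[1]))   # the bytes still contain c3 bc
```

With the option at its default, the file's bytes pass through unchanged, so a UTF-8 parser downstream sees C3 BC rather than a lone FC.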



Source: https://stackoverflow.com/questions/27982566/rmarkdown-utf-8-error-on-mutliple-operating-systems
