gsub in R with Unicode replacement gives different results under Windows compared with Unix?

悲&欢浪女 2020-12-04 00:34

Running the following command in R under Mac or Linux produces the expected result, that is, the Greek letter beta:

gsub(\"\", \"\\u         


        
2 Answers
  •  爱一瞬间的悲伤
    2020-12-04 01:03

    Just to elaborate on @MrFlick's solution: you have to set the encoding every time the string has been processed by gsub, as in:

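    # The <U+XXXX> markers are an assumption: the escape text is missing from the
    # strings and patterns as posted, so this form is inferred from "\\u\\1" below.
    s <- "blahblah<U+03B2>-blahblah<U+03B2>-blahblah"
    # setting the encoding here and not in the while loop will not fix the problem
    {
    while(grepl('<U\\+[0-9A-Fa-f]{4}>',s)){
        # build the text "\uXXXX" from the last marker and evaluate it to get the character
        newVal <- gsub('^.*<U\\+([0-9A-Fa-f]{4})>.*$','"\\\\u\\1"',s)
        newVal <- eval(parse(text=newVal))
        cat(newVal,'\n')
        s <- gsub('^(.*)<U\\+[0-9A-Fa-f]{4}>(.*)$',
                  paste0('\\1',newVal,'\\2'),
                  s)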
        # setting the encoding here fixes the cross platform differences
        Encoding(s) <- 'UTF-8'
    }
    cat(s,'\n')
    # setting the encoding here and not in the while loop will raise an error
    }
    Encoding(s)
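
As a side note, the eval(parse()) round trip can probably be avoided. Below is a minimal sketch, assuming the escapes have the form <U+XXXX> with four hex digits. Both that marker format and the helper name decode_u_escapes are assumptions for illustration, not part of the original answer. It turns each hex code into a character with strtoi() and intToUtf8(), then converts the result to UTF-8 with enc2utf8(). Whether this alone removes the platform difference may depend on the R version and locale, so treat it as a starting point rather than a verified fix.

```r
# Hypothetical helper (not from the original post): replaces every
# <U+XXXX> marker in a single string with the character it names.
decode_u_escapes <- function(s) {
  pattern <- "<U\\+([0-9A-Fa-f]{4})>"   # assumed marker format
  m <- gregexpr(pattern, s, perl = TRUE)
  regmatches(s, m) <- lapply(regmatches(s, m), function(tokens) {
    # keep only the four hex digits of each matched marker
    hex <- sub(pattern, "\\1", tokens, perl = TRUE)
    # hex digits -> integer code point -> one-character UTF-8 string
    vapply(strtoi(hex, 16L), intToUtf8, character(1))
  })
  # convert the result to UTF-8 once, after all replacements
  enc2utf8(s)
}

s <- decode_u_escapes("blahblah<U+03B2>-blahblah<U+03B2>-blahblah")
cat(s, "\n")
Encoding(s)
```

Doing the conversion once, at the end of the helper, stands in for the per-iteration `Encoding(s) <- 'UTF-8'` call in the loop above.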
    
