Handling special characters (e.g. accents) in R

余生分开走 2020-12-14 03:59

I am scraping names from the web into a data frame.

For a name such as "Tomáš Rosický", I get the result "TomÃ¡Å¡ RosickÃ½".

I tried

Enco         


        
4 Answers
  • 2020-12-14 04:23

    A way to export accents correctly:

    enc2utf8(as(dataframe$columnname, "character"))
    
  • 2020-12-14 04:26

    You've read in a page encoded in UTF-8. If x is your column of names, use Encoding(x) <- "UTF-8".
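
    A minimal sketch of that fix, assuming the scraped text holds valid UTF-8 bytes that R has not marked as UTF-8. The sample string and the names x and expected are illustrative and do not come from the question.

    ```r
    # Reference string; the \u escapes make R mark it as UTF-8.
    expected <- "Tom\u00e1\u0161 Rosick\u00fd"

    # Simulate scraped input: the same UTF-8 bytes, but with no encoding mark.
    x <- rawToChar(charToRaw(expected))
    Encoding(x)             # "unknown"

    # Declare the bytes to be UTF-8; the bytes themselves are not changed.
    Encoding(x) <- "UTF-8"
    Encoding(x)             # "UTF-8"
    identical(x, expected)  # TRUE
    ```

    Setting the mark only tells R how to interpret the bytes, so it helps only when the bytes really are UTF-8.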

  • 2020-12-14 04:31

    To read the file correctly, use the scan function:

    namb <- scan(file='g:/testcodering.txt', fileEncoding='UTF-8',
                 what=character(), sep='\n', allowEscapes=TRUE)
    cat(namb)
    

    This also works:

    con <- file('g:/testcodering.txt', "r", encoding='UTF-8')
    namc <- readLines(con)
    close(con)
    cat(namc)
    

    Both approaches read the file with the accents intact.
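
    A self-contained round trip may make this easier to try out. It is only a sketch: the temporary file stands in for g:/testcodering.txt, and the names path and name are hypothetical. It also uses a slightly different technique, passing encoding to readLines directly, which marks the lines as UTF-8 rather than re-encoding them the way a connection opened with encoding='UTF-8' does.

    ```r
    # Hypothetical temporary file standing in for g:/testcodering.txt.
    path <- tempfile(fileext = ".txt")
    name <- "Tom\u00e1\u0161 Rosick\u00fd"

    # Write the raw UTF-8 bytes plus a newline, bypassing locale translation.
    writeBin(c(charToRaw(enc2utf8(name)), as.raw(0x0a)), path)

    # encoding= here marks the lines as UTF-8; it does not re-encode them.
    namc <- readLines(path, encoding = "UTF-8")
    unlink(path)

    Encoding(namc)  # "UTF-8"
    namc == name    # TRUE
    ```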

  • 2020-12-14 04:32

    You should use this:

    df$colname <- iconv(df$colname, from="UTF-8", to="LATIN1")
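
    A small sketch of what this conversion does, using a hypothetical sample name rather than the data frame above. Note one assumption worth checking for your own data: Latin-1 has no code point for some characters, such as the š in the question's example, so names containing them cannot be represented after this conversion.

    ```r
    # Hypothetical sample name whose characters all exist in Latin-1.
    x <- "Jos\u00e9"

    # Re-encode the UTF-8 bytes as Latin-1 bytes.
    y <- iconv(x, from = "UTF-8", to = "LATIN1")

    # The two-byte UTF-8 sequence for e-acute becomes the single byte e9.
    charToRaw(y)

    # š is not part of Latin-1, so the outcome here depends on the
    # platform's iconv implementation (often NA).
    z <- iconv("Tom\u00e1\u0161", from = "UTF-8", to = "LATIN1")
    ```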
    