Converting a \u escaped Unicode string to ASCII

感动是毒 2020-11-30 09:05

After reading all about iconv and Encoding, I am still confused.

I am scraping the source of a web page, and I have a string that looks like this…

7 answers
  • 2020-11-30 10:01

    I sympathise; I have struggled with R and Unicode text in the past, and not always successfully. If your data is in x, first try a global replace, something like this:

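    # In R source, "\u003D" is parsed as the character "=", so the backslash
    # must be doubled to match the literal six-character text \u003D.
    # fixed = TRUE disables regex interpretation; U+003D is "=".
    x <- gsub("\\u003D", "=", x, fixed = TRUE)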
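
    If the text contains many different escapes, replacing them one at a time gets tedious. The following is only a sketch of how every literal \uXXXX sequence could be decoded in one pass with base R. The function name decode_u_escapes is made up for illustration, and the sketch assumes each escape has exactly four hex digits and does not combine surrogate pairs (code points above U+FFFF).

    ```r
    # Hypothetical helper: replace each literal \uXXXX escape with its character.
    decode_u_escapes <- function(x) {
      # Regex for a literal backslash, "u", then exactly four hex digits
      pattern <- "\\\\u[0-9A-Fa-f]{4}"
      m <- gregexpr(pattern, x)
      regmatches(x, m) <- lapply(regmatches(x, m), function(esc) {
        # Drop the leading "\u", parse the hex digits, convert to a character
        vapply(esc, function(e) intToUtf8(strtoi(substring(e, 3), base = 16L)),
               character(1), USE.NAMES = FALSE)
      })
      x
    }

    decode_u_escapes("a \\u003D\\u003E b")
    ```

    If the stringi package is available, its stri_unescape_unicode() function is designed for this kind of input and may be a better choice than a hand-rolled helper.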
    

    I sometimes use a construction like

    lapply(x, utf8ToInt)
    

    to see where the high code points are, e.g. anything over 150 (ASCII ends at 127). This helps me locate problems caused by non-breaking spaces (code point 160), for example, which seem to pop up every now and again.
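
    As a rough illustration of that check, the sketch below lists the positions of code points outside ASCII in each element. The sample vector and the 127 cutoff are assumptions chosen for the example; adjust the threshold to taste.

    ```r
    # Sample data (made up): the second element contains a non-breaking space
    x <- c("plain text", "non\u00A0breaking")

    # Integer code points for each string
    codes <- lapply(x, utf8ToInt)

    # Positions of code points above 127 (outside ASCII) in each element
    high <- lapply(codes, function(cp) which(cp > 127L))

    high
    ```

    An empty result for an element means it is pure ASCII; otherwise the positions tell you where to look.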
