After reading all about iconv and Encoding, I am still confused.
I am scraping the source of a web page I have a string that looks like thi
Although I have accepted Hong ooi's answer, I can't help thinking parse and eval is a heavyweight solution. Also, as pointed out, it is not secure, although for my application I can be confident that I will not get dangerous quotes.
So, I have devised an alternative, somewhat brutal, approach:
udecode <- function(string){
uconv <- function(chars) intToUtf8(strtoi(chars, 16L))
ufilter <- function(string) {
if (substr(string, 1, 1)=="|") uconv(substr(string, 2, 5)) else string
}
string <- gsub("\\\\u([[:xdigit:]]{4})", ",|\\1,", string, perl=TRUE)
strings <- unlist(strsplit(string, ","))
string <- paste(sapply(strings, ufilter), collapse='')
return(string)
}
Any simplifications welcomed!