Convert unicode code point to literal character in Go

Posted by 偶尔善良 on 2019-12-17 17:06:36

Question


Let's say I have a text file like this.

\u0053
\u0075
\u006E

Is there a way I can convert that to this?

S
u
n

Currently, I'm using ioutil.ReadFile("data.txt"), but when I print the data, I get the escaped code points (such as \u0053) instead of the literal characters. I realize this is the correct behavior for ReadFile; it's just not what I want.

I'm aiming for a substitution of the code points with their literal characters.


Answer 1:


You can use the strconv.Unquote() and strconv.UnquoteChar() functions to do the conversion.

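One thing you should be aware of is that strconv.Unquote() only accepts input that is itself quoted, i.e. it starts and ends with a double quote " or a back quote `. Escape sequences are only interpreted inside double quotes (a back-quoted string is returned as-is), so we have to wrap each line in double quotes manually.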

See this example:

lines := []string{
    `\u0053`,
    `\u0075`,
    `\u006E`,
}
fmt.Println(lines)

for i, v := range lines {
    var err error
    lines[i], err = strconv.Unquote(`"` + v + `"`)
    if err != nil {
        fmt.Println(err)
    }
}
fmt.Println(lines)

fmt.Println(strconv.Unquote(`"Go\u0070\x68\x65\x72"`))

Output:

[\u0053 \u0075 \u006E]
[S u n]
Gopher <nil>
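
The answer names strconv.UnquoteChar() but only demonstrates strconv.Unquote(). As a rough sketch of the other route, the loop below decodes a string one character at a time with UnquoteChar, which avoids wrapping the input in quotes. The helper name decodeEscapes is hypothetical and not part of the original answer; passing 0 as the quote argument means quote characters in the input are treated as ordinary characters.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// decodeEscapes is a hypothetical helper (not from the original answer).
// It decodes Go-style escape sequences in s one character at a time.
func decodeEscapes(s string) (string, error) {
	var b strings.Builder
	for len(s) > 0 {
		// quote == 0: neither \' nor \" is accepted as an escape,
		// and bare quote characters pass through unchanged.
		r, multibyte, tail, err := strconv.UnquoteChar(s, 0)
		if err != nil {
			return "", err
		}
		if r < utf8.RuneSelf || !multibyte {
			// ASCII, or a single byte written as \xNN or an octal escape.
			b.WriteByte(byte(r))
		} else {
			b.WriteRune(r)
		}
		s = tail
	}
	return b.String(), nil
}

func main() {
	fmt.Println(decodeEscapes(`\u0053\u0075\u006E`)) // prints: Sun <nil>
	fmt.Println(decodeEscapes(`Go\u0070her`))        // prints: Gopher <nil>
}
```

Because the input is never wrapped in quotes, a line that happens to contain a " character does not cause a syntax error here, which it would with the Unquote-based loop above.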



Answer 2:


A slightly different approach is to use strconv.ParseInt. It generates less garbage and uses less internal logic for parsing the lines (Unquote does a lot of other checks):

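for i, v := range lines {
    // Only handle lines of the exact form \uXXXX; leave anything else alone.
    if len(v) != 6 || v[:2] != `\u` {
        continue
    }

    // Parse the four hex digits after the \u prefix as the code point.
    if r, err := strconv.ParseInt(v[2:], 16, 32); err == nil {
        // Convert via rune; a direct string(int64) conversion is flagged by go vet.
        lines[i] = string(rune(r))
    }
}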




Source: https://stackoverflow.com/questions/34126749/convert-unicode-code-point-to-literal-character-in-go
