How do you convert unicode string to escapes in bash? [closed]

Submitted by 我与影子孤独终老i on 2020-05-24 05:38:31

Question


I need a tool that will translate a Unicode string into escape sequences like \u0230.

For example,

echo ãçé | convert-unicode-tool
\u00e3\u00e7\u00e9

Answer 1:


A pure-bash method:

echo ãçé |
   while IFS= read -r -n 1 u   # IFS= keeps spaces, -r keeps backslashes
   do [[ -n "$u" ]] && printf '\\u%04x' "'$u"
   done

The leading apostrophe in "'$u" tells printf how to interpret the argument.

From the GNU man page online:

If the leading character of a numeric argument is ‘"’ or ‘'’ then its value is the numeric value of the immediately following character. Any remaining characters are silently ignored if the POSIXLY_CORRECT environment variable is set; otherwise, a warning is printed. For example, ‘printf "%d" "'a"’ outputs ‘97’ on hosts that use the ASCII character set, since ‘a’ has the numeric value 97 in ASCII.

That lets us pass the character to printf for numeric interpretations such as %d or %03o, or here, %04x.
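
As a quick illustration of the leading-quote rule quoted above, here is a minimal sketch. The helper name codepoint_of is an assumption made for this example and does not appear in the original answer; it assumes bash's builtin printf.

```shell
#!/usr/bin/env bash
# codepoint_of is a hypothetical helper name, not part of the original answer.
# The leading single quote tells printf to use the numeric value of the
# character that follows it.
codepoint_of() {
    printf '%d\n' "'$1"
}

codepoint_of a            # prints 97
printf '%03o\n' "'a"      # prints 141 (octal)
printf '%04x\n' "'a"      # prints 0061 (hex, zero-padded to four digits)
```

The same argument works with any numeric conversion; only the format string changes.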

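The [[ -n "$u" ]] test is needed because echo appends a trailing newline. read treats that newline as the line delimiter and leaves u empty, and printf evaluates the resulting lone quote as 0, which would otherwise be printed as \u0000.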

Output:

$:     echo ãçé |
>        while IFS= read -r -n 1 u
>        do [[ -n "$u" ]] && printf '\\u%04x' "'$u"
>        done
\u00e3\u00e7\u00e9

Without the empty-string check:

$: echo ãçé | while IFS= read -r -n 1 u; do printf '\\u%04x' "'$u";done
\u00e3\u00e7\u00e9\u0000
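
One way to sidestep the trailing \u0000 entirely is to walk the string by index instead of piping it through read, so no newline ever enters the loop. The following is only a sketch under stated assumptions: the function name to_unicode_escapes is hypothetical, and a UTF-8 locale is assumed so that bash slices the string by character rather than by byte.

```shell
#!/usr/bin/env bash
# to_unicode_escapes is a hypothetical helper name used for illustration.
# Assumes a UTF-8 locale so that ${s:i:1} yields whole characters.
to_unicode_escapes() {
    local s=$1 i
    for (( i = 0; i < ${#s}; i++ )); do
        # "'X" makes printf emit the numeric value of character X.
        printf '\\u%04x' "'${s:i:1}"
    done
    printf '\n'
}

to_unicode_escapes 'a b'   # prints \u0061\u0020\u0062
```

Spaces are preserved because nothing is split on IFS. Note that %04x is only a minimum width: a code point above U+FFFF would print more than four hex digits, and this sketch does not convert such characters to surrogate pairs.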



Answer 2:


› echo -n ãçé | perl -C -e'print for map { sprintf "\\u%04x", ord } split //, readline'
\u00e3\u00e7\u00e9


Source: https://stackoverflow.com/questions/51307312/how-do-you-convert-unicode-string-to-escapes-in-bash
