How to write 3 bytes unicode literal in Java?

不思量自难忘° 2020-12-11 21:19

I'd like to write the Unicode literal U+10428 in Java. http://www.marathon-studios.com/unicode/U10428/Deseret_Small_Letter_Long_I

I tried with '\u10428' and it doesn't work.

1 answer
  • 2020-12-11 21:41

    Java went all-in on Unicode back when people thought 64K code points would be enough for everyone (where have we heard that before?), so it started out with UCS-2 and was later upgraded to UTF-16.

    But they never bothered to add an escape sequence for Unicode characters outside the BMP (the Basic Multilingual Plane, U+0000 to U+FFFF).

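    Thus, if you want to write such a character as an escape, your only recourse is to recode it manually into a UTF-16 surrogate pair and use two UTF-16 escapes. To recode, subtract 0x10000 from the code point; the high surrogate is 0xD800 plus the top 10 bits of the result, and the low surrogate is 0xDC00 plus the bottom 10 bits.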

    Your example code point U+10428 becomes "\uD801\uDC28".
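The recoding can also be done in a few lines of Java instead of with an online tool. This is only a minimal sketch: the class name `SurrogatePairSketch` and the method `toUtf16Escapes` are made up for illustration, and the arithmetic is the standard UTF-16 surrogate calculation rather than anything taken from the answer above.

```java
public class SurrogatePairSketch {

    // Hypothetical helper: returns the two UTF-16 escape sequences
    // (as printable text) for a code point outside the BMP.
    static String toUtf16Escapes(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;       // 20-bit value
        int high = 0xD800 + (offset >> 10);     // top 10 bits
        int low = 0xDC00 + (offset & 0x3FF);    // bottom 10 bits
        // The doubled backslashes produce a literal backslash in the output.
        return String.format("\\u%04X\\u%04X", high, low);
    }

    public static void main(String[] args) {
        System.out.println(toUtf16Escapes(0x10428));
    }
}
```

For U+10428 this prints `\uD801\uDC28`, matching the value given above. The standard library methods `Character.highSurrogate` and `Character.lowSurrogate` (Java 7 and later) perform the same split.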

    I used this site for the recoding: http://rishida.net/tools/conversion/

    Quote from the Java Language Specification:

    3.10.5 String Literals

    A string literal consists of zero or more characters enclosed in double quotes. Characters may be represented by escape sequences (§3.10.6) - one escape sequence for characters in the range U+0000 to U+FFFF, two escape sequences for the UTF-16 surrogate code units of characters in the range U+010000 to U+10FFFF.
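To see what the quoted rule means in practice, here is a small sketch (the class name `SupplementaryLiteralSketch` and the constant `LONG_I` are hypothetical) showing that a literal written with two escapes holds two `char` values but only one code point:

```java
public class SupplementaryLiteralSketch {

    // One escape per UTF-16 surrogate code unit of U+10428.
    static final String LONG_I = "\uD801\uDC28";

    public static void main(String[] args) {
        // Number of UTF-16 code units (char values) in the string.
        System.out.println(LONG_I.length());
        // Number of Unicode code points in the string.
        System.out.println(LONG_I.codePointCount(0, LONG_I.length()));
        // The code point itself, in hexadecimal.
        System.out.println(Integer.toHexString(LONG_I.codePointAt(0)));
        // Building the same string from the code point at runtime.
        System.out.println(LONG_I.equals(new String(Character.toChars(0x10428))));
    }
}
```

This prints 2, 1, 10428 and true. Because the character occupies two `char` values, it fits in a `String` literal but not in a single `char` literal. If an escaped literal is not required, `Character.toChars` or `StringBuilder.appendCodePoint` can build the string from the code point at runtime.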
