Java charAt used with characters that have two code units

梦毁少年i 2020-12-03 14:30

From Core Java, vol. 1, 9th ed., p. 69:

The character ℤ requires two code units in the UTF-16 encoding. Calling

String sentence =         


        
4 answers
  •  夕颜 (original poster)
    2020-12-03 14:49

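    According to the documentation, String is represented internally as UTF-16, so charAt() returns a single UTF-16 code unit, not a full code point. A character outside the Basic Multilingual Plane occupies two code units (a surrogate pair), so charAt() hands you only one half of it at a time. If you are interested in the individual code points, you can use this code (taken from another answer):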

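    // length() counts UTF-16 code units, not code points
    final int length = sentence.length();
    for (int offset = 0; offset < length; ) {
        // read the full code point that starts at this code unit offset
        final int codepoint = sentence.codePointAt(offset);

        // do something with the codepoint

        // advance by 1 code unit (BMP) or 2 (surrogate pair)
        offset += Character.charCount(codepoint);
    }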
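
To make the difference between code units and code points concrete, here is a minimal, self-contained sketch. It is an illustration only: the class name `CodePointDemo`, the helpers `sample()` and `codePoints()`, and the sample text are hypothetical, and U+1D546 is used as a stand-in supplementary character because the string in the question is cut off above. Only standard `String` and `Character` methods are called.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointDemo {

    // Hypothetical sample text (an assumption, not the question's string):
    // it starts with U+1D546, which lies outside the BMP and therefore
    // needs a surrogate pair (two char values) in UTF-16.
    public static String sample() {
        return new String(Character.toChars(0x1D546)) + " is the set of octonions";
    }

    // Collects the code points of s using the same loop as in the answer.
    public static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<>();
        for (int offset = 0; offset < s.length(); ) {
            int codepoint = s.codePointAt(offset);
            result.add(codepoint);
            offset += Character.charCount(codepoint);
        }
        return result;
    }

    public static void main(String[] args) {
        String sentence = sample();

        // length() counts code units; codePointCount() counts code points.
        // Here the first is one larger than the second.
        System.out.println("code units:  " + sentence.length());
        System.out.println("code points: " + sentence.codePointCount(0, sentence.length()));

        // charAt(1) is the second half of the surrogate pair, not the space.
        System.out.println("charAt(1) is a low surrogate: "
                + Character.isLowSurrogate(sentence.charAt(1)));

        // To reach the second code point, convert a code point index
        // into a code unit index first.
        int index = sentence.offsetByCodePoints(0, 1);
        System.out.println("second code point is a space: "
                + (sentence.codePointAt(index) == ' '));
    }
}
```

If you only need the i-th code point, `offsetByCodePoints` followed by `codePointAt` avoids walking the string by hand. The explicit loop is more convenient when you have to visit every code point anyway.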
    
