Java String Unicode Value

没有蜡笔的小新 2020-12-15 08:31

How can I get the Unicode value of a string in Java?

For example, if the string is "Hi", I need something like \uXXXX\uXXXX.

2 answers
  • 2020-12-15 08:46

    This method converts an arbitrary String to an ASCII-safe representation to be used in Java source code (or properties files, for example):

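    import java.util.Formatter;

    public String escapeUnicode(String input) {
      StringBuilder b = new StringBuilder(input.length());
      Formatter f = new Formatter(b); // writes its output straight into b
      for (char c : input.toCharArray()) {
        if (c < 128) {
          b.append(c); // ASCII passes through unchanged
        } else {
          f.format("\\u%04x", (int) c); // one four-digit escape per UTF-16 code unit
        }
      }
      return b.toString();
    }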
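
As a usage sketch, the class below wraps the same logic as the method above and adds a variant that escapes every char, which is closer to the \uXXXX\uXXXX form the question asks for. The class name EscapeUnicodeDemo and the helper escapeAll are illustrative names, not part of the original answer.

```java
import java.util.Formatter;

public class EscapeUnicodeDemo {

    // Same logic as the answer's method: ASCII is copied, everything else is escaped.
    public static String escapeUnicode(String input) {
        StringBuilder b = new StringBuilder(input.length());
        Formatter f = new Formatter(b);
        for (char c : input.toCharArray()) {
            if (c < 128) {
                b.append(c);
            } else {
                f.format("\\u%04x", (int) c);
            }
        }
        return b.toString();
    }

    // Hypothetical variant: escape every char, ASCII included.
    public static String escapeAll(String input) {
        StringBuilder b = new StringBuilder(input.length() * 6);
        for (char c : input.toCharArray()) {
            b.append(String.format("\\u%04x", (int) c));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeAll("Hi"));
        // The literal below is "Hello" with an e-acute as its second letter.
        System.out.println(escapeUnicode("H\u00e9llo"));
    }
}
```

Run as written, this should print \u0048\u0069 for "Hi" and H\u00e9llo for the accented sample.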
    
  • 2020-12-15 08:54

    Some Unicode characters span two Java chars. Quoting http://docs.oracle.com/javase/tutorial/i18n/text/unicode.html:

    The characters with values that are outside of the 16-bit range, and within the range from 0x10000 to 0x10FFFF, are called supplementary characters and are defined as a pair of char values.
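
To make the quoted definition concrete, here is a small sketch that prints the char values Java uses to store a code point. The class name SurrogateDemo, the helper utf16Units, and the sample code point U+1F600 are illustrative choices.

```java
public class SurrogateDemo {

    // Hex value of each UTF-16 code unit (Java char) used to store the code point.
    public static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is above 0xFFFF, so it is stored as a pair of chars.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());                      // number of chars
        System.out.println(s.codePointCount(0, s.length())); // number of code points
        System.out.println(utf16Units(0x1F600));             // the surrogate pair
        System.out.println(utf16Units('A'));                 // a single char
    }
}
```

For U+1F600 the string has length 2 but contains one code point, and the two chars are d83d and de00.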

    The correct way to escape non-ASCII characters is to iterate by code point:

    private static String escapeNonAscii(String str) {
      StringBuilder retStr = new StringBuilder();
      // Advance by code point, so a surrogate pair is read as one character.
      for (int i = 0; i < str.length(); ) {
        int cp = str.codePointAt(i);
        i += Character.charCount(cp);

        if (cp < 128) {
          retStr.appendCodePoint(cp);
        } else {
          // A Java Unicode escape takes exactly four hex digits, so a
          // supplementary character is written as its two surrogate chars.
          for (char unit : Character.toChars(cp)) {
            retStr.append(String.format("\\u%04x", (int) unit));
          }
        }
      }
      return retStr.toString();
    }
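
If the goal is to inspect the code point values themselves rather than to produce Java-source escapes, a sketch along the following lines may help. It assumes Java 8 or later for String.codePoints(). The class name CodePointDemo and the U+XXXX output format are illustrative choices, not something the answer specifies.

```java
import java.util.stream.Collectors;

public class CodePointDemo {

    // Lists each code point of the string in U+XXXX notation, separated by spaces.
    public static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(codePoints("Hi"));

        // "A" followed by the supplementary character U+1F600.
        String mixed = "A" + new String(Character.toChars(0x1F600));
        System.out.println(codePoints(mixed));
    }
}
```

This should print U+0048 U+0069 for "Hi" and U+0041 U+1F600 for the mixed sample, so a supplementary character shows up as one value even though it occupies two chars.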
    