Convert International String to \u Codes in Java

离开以前 2020-11-29 02:53

How can I convert an international (e.g. Russian) String to \u escape codes (Unicode numbers),
e.g. \u041e\u041a for the Cyrillic letters ОК?

12 Answers
  • 2020-11-29 03:18

    This kind of conversion is called Unicode escaping (encoding); the reverse is called decoding or unescaping. The linked site is an online converter.

  • 2020-11-29 03:20

    You could use org.apache.commons.lang.StringEscapeUtils. As far as I know, escapeJavaStyleString is an internal helper there, and the public method that calls it is escapeJava.

  • 2020-11-29 03:23

    If you need this in order to write a .properties file, you can just add the Strings to a Properties object and then save it to a file. It will take care of the conversion.
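
    As a minimal sketch of that approach: the class and method names below are made up for illustration, and the output goes to an in-memory stream instead of a file. It relies on the OutputStream overload of Properties.store, which writes ISO-8859-1 and escapes characters outside printable ASCII; the Writer overload does not escape.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEscapeSketch {

    /** Serializes one key/value pair with Properties.store and returns the text. */
    static String storeToString(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // The OutputStream overload escapes non-ASCII characters on its own
            props.store(out, null);
        } catch (IOException e) {
            // Not expected for an in-memory stream
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Prints a date comment line followed by the escaped entry
        System.out.print(storeToString("greeting", "\u041e\u041a"));
    }
}
```

    Note that Properties writes the hex digits in upper case, so the entry comes out as greeting=\u041E\u041A.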

  • 2020-11-29 03:24

    I also had this problem. I had some Portuguese text with special characters, but these characters were already in Unicode escape format (e.g. \u00e3).

    So I wanted to convert S\u00e3o to São.

    I did it using the Apache Commons StringEscapeUtils, as @sorin-sbarnea said. It can be downloaded here.

    Use the method unescapeJava, like this:

    // Doubled backslash: the string holds the six characters of the escape, not ã
    String text = "S\\u00e3o";
    text = StringEscapeUtils.unescapeJava(text);
    System.out.println("text " + text);
    

    (There is also the method escapeJava, but that one does the opposite: it replaces the special characters in the string with \u escapes.)

    If anyone knows a pure-Java solution, please share it.
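
    One possible pure-Java approach, sketched here with java.util.regex only. The class and method names are hypothetical, and this is not how unescapeJava is implemented: it handles only \u escapes (not \n, \t and so on) and does not treat a doubled backslash specially.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescapeSketch {

    // A backslash, one or more 'u', then exactly four hex digits
    private static final Pattern ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    /** Replaces each escape sequence in the text with the character it denotes. */
    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or a backslash in the decoded character
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("S\\u00e3o"));
    }
}
```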

  • 2020-11-29 03:31

    Just some basic methods for that (inspired by the native2ascii tool):

    /**
     * Encodes a String like äöü to \u00e4\u00f6\u00fc.
     *
     * @param text the text to encode, may be null
     * @return the text with every non-ASCII character escaped, or null for null input
     */
    public String native2ascii(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(text.length());
        for (char ch : text.toCharArray()) {
            sb.append(native2ascii(ch));
        }
        return sb.toString();
    }

    /**
     * Encodes a character like ä to \u00e4.
     *
     * @param ch the character to encode
     * @return the escape sequence for a non-ASCII character, otherwise the character itself
     */
    public String native2ascii(char ch) {
        if (ch > '\u007f') {
            // Zero-pad to four lowercase hex digits
            return String.format("\\u%04x", (int) ch);
        }
        return Character.toString(ch);
    }
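
    To try the same idea on the example from the question, here is a self-contained sketch (the class name and the static escape method are chosen only for illustration). Because it walks over char values, a character outside the Basic Multilingual Plane comes out as two escapes, one per surrogate, which is also how Java source and .properties files write such characters.

```java
public class UnicodeEscapeSketch {

    /** Escapes every char above 0x7f as a backslash-u sequence with four hex digits. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch > 0x7f) {
                sb.append(String.format("\\u%04x", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two Cyrillic letters from the question
        System.out.println(escape("\u041e\u041a"));
        // A supplementary character, stored in the String as a surrogate pair
        System.out.println(escape("\uD83D\uDE00"));
    }
}
```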
    
  • 2020-11-29 03:34

    Apache Commons StringEscapeUtils.escapeEcmaScript(String) returns a string in which the non-ASCII characters are escaped using the \u notation.

    "Art of Beer                                                                     