How can I convert an international (e.g. Russian) String to \u numbers (Unicode numbers), e.g. \u041e\u041a for OK?
This kind of conversion is called Unicode escaping; the reverse direction is called decoding or unescaping. Online converters exist for both.
You could use escapeJava from org.apache.commons.lang.StringEscapeUtils (in Commons Lang 2.x it delegates to the private helper escapeJavaStyleString).
In case you need this to write a .properties file, you can just add the Strings to a Properties object and then save it to a file. It will take care of the conversion.
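A minimal sketch of the Properties route, using only the JDK. The class name PropertiesEscapeSketch, the helper toPropertiesText and the key "greeting" are made up for illustration. It relies on Properties.store(OutputStream, String), which writes ISO-8859-1 and escapes characters outside the printable ASCII range; note that the store(Writer, String) overload does not escape them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEscapeSketch {

    // Hypothetical helper: returns the text that Properties.store(OutputStream, ...)
    // would write for a single key/value pair.
    static String toPropertiesText(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // The OutputStream overload performs the Unicode escaping.
            props.store(out, null);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Prints a timestamp comment line followed by the escaped key/value line.
        System.out.println(toPropertiesText("greeting", "\u041e\u041a"));
    }
}
```

In a real program you would pass a FileOutputStream for the target .properties file instead of the in-memory stream.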
I also had this problem. I had some Portuguese text with some special characters, but these characters were already in Unicode escape format (e.g. \u00e3).
So I wanted to convert S\u00e3o to São.
I did it using StringEscapeUtils from Apache Commons Lang, as @sorin-sbarnea said. The library can be downloaded from the Apache Commons project site.
Use the method unescapeJava, like this:
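import org.apache.commons.lang.StringEscapeUtils;
// The doubled backslash keeps the escape as literal text; a single one would be decoded by the compiler.
String text = "S\\u00e3o";
text = StringEscapeUtils.unescapeJava(text);
System.out.println("text " + text);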
(There is also the method escapeJava, but this one does the opposite: it replaces the special characters in the string with their \u escapes.)
If anyone knows a pure-Java solution, please tell us.
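For the pure-Java route, one possible sketch uses java.util.regex from the JDK. The class name UnicodeUnescapeSketch and the method unescape are hypothetical, and this is not how StringEscapeUtils is implemented. It only decodes escapes with exactly four hex digits and does not treat a doubled backslash specially, so it is a simplification of what unescapeJava handles.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeUnescapeSketch {

    // A backslash, the letter u, then exactly four hex digits (captured as group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each four-digit escape with the char it denotes.
    static String unescape(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '$' or backslash characters.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash makes the literal contain the escape as plain text.
        System.out.println(unescape("S\\u00e3o"));
    }
}
```

Because each escape is decoded as one UTF-16 code unit, a surrogate pair written as two consecutive escapes is reassembled correctly without extra handling.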
Just some basic methods for that (inspired by the native2ascii tool):
/**
 * Encode a String like äöü to \u00e4\u00f6\u00fc
 *
 * @param text the text to encode, may be null
 * @return the text with every non-ASCII char escaped, or null if text is null
 */
public String native2ascii(String text) {
    if (text == null)
        return text;
    StringBuilder sb = new StringBuilder();
    for (char ch : text.toCharArray()) {
        sb.append(native2ascii(ch));
    }
    return sb.toString();
}

/**
 * Encode a Character like ä to \u00e4
 *
 * @param ch the char (UTF-16 code unit) to encode
 * @return the escape for a non-ASCII char, otherwise the char itself as a String
 */
public String native2ascii(char ch) {
    if (ch > '\u007f') {
        // write backslash-u followed by four zero-padded lowercase hex digits
        return String.format("\\u%04x", (int) ch);
    } else {
        return Character.toString(ch);
    }
}
Apache Commons StringEscapeUtils.escapeEcmaScript(String) returns a string with Unicode characters escaped using the \u notation.
"Art of Beer