Why does Integer.parseInt("\uD835\uDFE8") fail?

Submitted by 我怕爱的太早我们不能终老 on 2019-12-08 08:57:50

Question


I was under the impression that Java supports Unicode characters. I wrote this test and found that it fails. Why? Is it a bug, or is this behaviour documented somewhere?

// MATHEMATICAL SANS-SERIF "𝟨"
String unicodeNum6 = "\uD835\uDFE8";
int codePoint6 = unicodeNum6.codePointAt(0);    
int val6 = Character.getNumericValue(codePoint6);
System.out.println("unicodeNum6 = "+ unicodeNum6
    + ", codePoint6 = "+ codePoint6+ ", val6 = "+val6);
int unicodeNum6Int = Integer.parseInt(unicodeNum6);

This fails with: Exception in thread "main" java.lang.NumberFormatException: For input string: "𝟨"

That is unexpected, I think, since the println works and prints the expected line:

unicodeNum6 = 𝟨, codePoint6 = 120808, val6 = 6

So Java clearly knows the numeric value of the Unicode character but does not use it in parseInt.

Can someone give a good reason why it should fail?


Answer 1:


It's not a bug; the behaviour is documented. According to the documentation for parseInt(String s, int radix):

The characters in the string must all be digits of the specified radix (as determined by whether Character.digit(char, int) returns a nonnegative value), except that the first character may be an ASCII minus sign '-' ('\u002D') to indicate a negative value or an ASCII plus sign '+' ('\u002B') to indicate a positive value

If you try:

int aa = Character.digit('\uD835', 10);
int bb = Character.digit('\uDFE8', 10);

You'll see that both return -1: each argument is one half of a surrogate pair, and a lone surrogate is not a digit.
Note that Integer.parseInt(unicodeNum6) simply calls Integer.parseInt(unicodeNum6, 10).
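
To make the char versus code point distinction concrete, here is a minimal sketch. It shows that the string holds two chars but one code point, that the code-point overload Character.digit(int, int) does recognise the digit, and how a code-point-aware parser could be written. The class CodePointParseDemo and the helper parseUnicodeInt are hypothetical names introduced for illustration, not part of the JDK, and the helper handles only non-negative decimal input with no sign.

```java
public class CodePointParseDemo {

    // Hypothetical helper (not a JDK method): parses a non-negative decimal
    // integer by iterating over code points instead of chars.
    static int parseUnicodeInt(String s) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int result = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);            // reads a full surrogate pair if present
            int digit = Character.digit(cp, 10);  // code-point overload of Character.digit
            if (digit < 0) {
                throw new NumberFormatException("For input string: \"" + s + "\"");
            }
            // multiplyExact/addExact throw ArithmeticException on int overflow
            result = Math.addExact(Math.multiplyExact(result, 10), digit);
            i += Character.charCount(cp);         // advance by 1 or 2 chars
        }
        return result;
    }

    public static void main(String[] args) {
        String unicodeNum6 = "\uD835\uDFE8";
        System.out.println(unicodeNum6.length());                                 // 2
        System.out.println(unicodeNum6.codePointCount(0, unicodeNum6.length()));  // 1
        System.out.println(Character.digit(unicodeNum6.codePointAt(0), 10));      // 6
        System.out.println(parseUnicodeInt(unicodeNum6));                         // 6
        System.out.println(parseUnicodeInt("4\uD835\uDFE82"));                    // 462
    }
}
```

Whether accepting such digits is desirable depends on the application; the sketch only illustrates that the numeric value is reachable once the string is read by code point.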



Source: https://stackoverflow.com/questions/32813135/why-does-integer-parseint-ud835-udfe8-fail
