Unicode input in a console application in Java

Submitted by 为君一笑 on 2019-11-28 01:18:56

Some notes:

  • -Dfile.encoding=utf8 is not supported and may cause unintended side-effects:

The "file.encoding" property is not required by the J2SE platform specification; it's an internal detail of Sun's implementations and should not be examined or modified by user code. It's also intended to be read-only; it's technically impossible to support the setting of this property to arbitrary values on the command line or at any other time during program execution.

  • The Console class will detect and use the terminal encoding but doesn't support code page 65001 (UTF-8) on Windows - at least, it didn't the last time I tried it
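To illustrate the two notes above, here is a minimal sketch of reading input without touching file.encoding: the charset is passed explicitly to the reader, and the Console is used only when one is attached. The class name ExplicitCharsetInput and both helper methods are hypothetical, made up for this example. Whether Console decodes correctly still depends on the terminal, as noted above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.Console;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class ExplicitCharsetInput {

    // Decodes one line from a byte stream using the charset given by the caller,
    // so the result does not depend on the JVM's default encoding.
    // Note: for repeated reads from the same stream, create the reader once and
    // reuse it, because BufferedReader reads ahead.
    static String readLine(InputStream in, Charset charset) {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
            return reader.readLine(); // null at end of input
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Uses the Console when the JVM has one; otherwise falls back to System.in
    // decoded with the charset the caller assumes the input is in.
    static String readLinePreferConsole(Charset fallback) {
        Console console = System.console();
        if (console != null) {
            return console.readLine();
        }
        return readLine(System.in, fallback);
    }

    public static void main(String[] args) {
        // Simulated UTF-8 input, so the demo does not block waiting for a terminal.
        byte[] sample = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);
        String line = readLine(new ByteArrayInputStream(sample), StandardCharsets.UTF_8);
        System.out.println(line.length()); // number of decoded characters
    }
}
```

In a real program you would call readLinePreferConsole with whatever charset you expect redirected input to use; the UTF-8 choice here is an assumption, not something the platform guarantees.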

I believe that the correct, documented way to use Unicode with cmd.exe is to use WriteConsoleW and ReadConsoleW.

I wrote a couple of blog posts when I was looking at this:

AlexR

An NPE is thrown when you call Arrays.toString(lineBytes), which means that lineBytes is null.

lineBytes holds the value of line.getBytes(). getBytes() can return null only if an UnsupportedEncodingException is thrown inside.
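When debugging this kind of problem, it can help to dump the bytes in a way that cannot itself fail. The following is only a sketch: the class LineBytes and the method describe are hypothetical names, and it assumes the line comes from something like BufferedReader.readLine(), which returns null at end of input. Passing a Charset object instead of an encoding name also avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical diagnostic helper, for illustration only.
final class LineBytes {

    // Returns a printable dump of the bytes of `line` in the given charset,
    // or a marker string when no line was read.
    static String describe(String line, Charset charset) {
        if (line == null) {
            return "<no input>";
        }
        byte[] lineBytes = line.getBytes(charset);
        return Arrays.toString(lineBytes);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) encodes to two bytes in UTF-8.
        System.out.println(describe("\u00e9", StandardCharsets.UTF_8));
        System.out.println(describe(null, StandardCharsets.UTF_8));
    }
}
```

Comparing the dump against the expected UTF-8 bytes shows whether the characters were already damaged before they reached your code.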

It happens on Windows because the Windows command prompt does not support Unicode by default. It works on Ubuntu because its terminal is fully Unicode-enabled. It works partially in Eclipse because Eclipse's console window is a Java component that supports Unicode for input, and supports it for output when JAVA_TOOL_OPTIONS is set.
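On the output side, an alternative to configuring the encoding from outside the program is to choose it in code. This is a sketch, not a fix for the command prompt itself: the class Utf8Output and the method wrap are hypothetical names, and whether the characters display correctly still depends on the terminal's code page and font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8Output {

    // Wraps any output stream in an auto-flushing PrintStream that encodes text as UTF-8,
    // regardless of the JVM's default encoding.
    static PrintStream wrap(OutputStream out) {
        try {
            return new PrintStream(out, true, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        PrintStream utf8Out = wrap(System.out);
        utf8Out.println("caf\u00e9"); // written as UTF-8 bytes
    }
}
```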

The bottom line is that you need to configure the Windows command prompt so that it can handle Unicode characters. I have seen several discussions on this topic. Please take a look at this one: Unicode characters in Windows command line - how?

I hope this will help you.
