Question
I am struggling to get Eclipse to read in Chinese characters correctly, and I am not sure where I may be going wrong.
Specifically, somewhere between reading in a string of Chinese (simplified or traditional) from the console and outputting it, it gets garbled. Even when outputting a large string of mixed text (English/Chinese characters), it appears to only alter the appearance of the Chinese characters.
I have cut it down to the following test example and explicitly annotated it with what I believe is happening at each stage - note that I am a student and would very much like to confirm my understanding (or otherwise) :)
public static void main(String[] args) {
    try
    {
        boolean isRunning = true;

        //Raw flow of input data from the console
        InputStream inputStream = System.in;
        //Allows you to read the stream, using either the default character encoding, else the specified encoding;
        InputStreamReader inputStreamReader = new InputStreamReader(inputStream, "UTF-8");
        //Adds functionality for converting the stream being read in, into Strings(?)
        BufferedReader input_BufferedReader = new BufferedReader(inputStreamReader);

        //Raw flow of output data to the console
        OutputStream outputStream = System.out;
        //Write a stream, from a given bit of text
        OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
        //Adds functionality to the base ability to write to a stream
        BufferedWriter output_BufferedWriter = new BufferedWriter(outputStreamWriter);

        while(isRunning) {
            System.out.println();//force extra newline
            System.out.print("> ");

            //To read in a line of text (as a String):
            String userInput_asString = input_BufferedReader.readLine();

            //To output a line of text:
            String outputToUser_fromString_englishFromCode = "foo"; //outputs correctly
            output_BufferedWriter.write(outputToUser_fromString_englishFromCode);
            output_BufferedWriter.flush();

            System.out.println();//force extra newline

            String outputToUser_fromString_ChineseFromCode = "之謂甚"; //outputs correctly
            output_BufferedWriter.write(outputToUser_fromString_ChineseFromCode);
            output_BufferedWriter.flush();

            System.out.println();//force extra newline

            String outputToUser_fromString_userSupplied = userInput_asString; //outputs correctly when given English text, garbled when given Chinese text
            output_BufferedWriter.write(outputToUser_fromString_userSupplied);
            output_BufferedWriter.flush();

            System.out.println();//force extra newline
        }
    }
    catch (Exception e) {
        // TODO: handle exception
    }
}
Sample output:
> 之謂甚
foo
之謂甚
之謂甚
> oaea
foo
之謂甚
oaea
> mixed input - English: fubar; Chinese: 之謂甚;
foo
之謂甚
mixed input - English: fubar; Chinese: 之謂甚;
>
What is seen on this Stack Overflow post matches exactly what I see in the Eclipse console and in the Eclipse debugger (when viewing/editing the variable values). Altering the variable values manually via the Eclipse debugger makes the code that depends on them behave as I would expect, suggesting that the problem lies in how the text is read IN.
I have tried many different combinations of scanners/buffered stream [reader|writer]s etc. to read in and output, with and without explicit character encodings, though this wasn't done particularly systematically and I could easily have missed something.
I have tried to set the Eclipse environment to use UTF-8 wherever possible, but I guess I could have missed a place or two. Note that the console will correctly output hard-coded Chinese characters.
Any assistance / guidance on this matter is greatly appreciated :)
Answer 1:
It looks like the console is not reading the input correctly. Here is a link that I believe describes your problem and some workarounds.
http://paranoid-engineering.blogspot.com/2008/05/getting-unicode-output-in-eclipse.html
Simple answer: try adding the JVM argument -Dfile.encoding=UTF-8 to your eclipse.ini. It is a JVM system property rather than an environment variable, so it belongs after the -vmargs line. Before enabling this for the whole of Eclipse, you could set it as a VM argument in the debug configuration for this program and see if it works.
The link has a lot more suggestions.
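To confirm whether the -Dfile.encoding=UTF-8 setting actually reached the JVM that runs your program, you could print what the JVM reports at startup. This is only a minimal sketch: the class name EncodingCheck and the method describe are made up for illustration, and the values printed depend on your JDK version and launch configuration.

```java
import java.nio.charset.Charset;

class EncodingCheck {
    // Builds a one-line summary of the charset settings this JVM started with.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run this from the same Eclipse launch configuration as the failing program.
        System.out.println(describe());
    }
}
```

If the summary does not mention UTF-8 after you changed eclipse.ini or the launch configuration, the setting was probably applied in a different place from the one this launch uses.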
Answer 2:
Try this: in Eclipse, right-click your main class and click Run As > Run Configurations. Then go to the Common tab and change the encoding to UTF-8. That should work!
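The reason the console encoding matters can be reproduced without Eclipse. The sketch below is an illustration under assumptions, not a description of what Eclipse does internally: a ByteArrayInputStream stands in for the console, the sender is assumed to use UTF-8, and the names ConsoleDecodingDemo and readFirstLine are hypothetical. The sample text 之謂甚 is written with Unicode escapes so the result does not depend on the source file's own encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleDecodingDemo {
    // Reads one line from the stream, decoding the bytes with the given charset.
    static String readFirstLine(InputStream in, Charset charset) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // The three sample characters from the question, as Unicode escapes.
        String typed = "\u4e4b\u8b02\u751a";
        // Stand-in for the console: it sends the typed line as UTF-8 bytes.
        byte[] sent = (typed + "\n").getBytes(StandardCharsets.UTF_8);

        // Reader charset matches the sender: the text survives.
        String matched = readFirstLine(new ByteArrayInputStream(sent), StandardCharsets.UTF_8);
        // Reader charset differs from the sender: every byte becomes its own character.
        String mismatched = readFirstLine(new ByteArrayInputStream(sent), StandardCharsets.ISO_8859_1);

        System.out.println(typed.equals(matched));     // true
        System.out.println(typed.equals(mismatched));  // false
        System.out.println(mismatched.length());       // 9
    }
}
```

ASCII text decodes identically under both charsets here, which is consistent with the English input in the question surviving while the Chinese input is garbled.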
Answer 3:
This seems to be an encoding problem. There might be two problems here: 1. You haven't activated the compiler's ability to read anything but ASCII characters; in your case it needs to be able to read UTF-8 characters. 2. You may have deleted certain language packs? This is unlikely, since you are probably able to write Chinese characters.
You should search around and learn how to get your IDE to compile non-ASCII characters correctly. In Python this is done in the code itself; I'm unsure how it is done in Java.
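For what it's worth, Java has no in-file encoding declaration like Python's; the source encoding is passed to the compiler (javac accepts an -encoding option, and Eclipse has a text file encoding setting per resource). One way to rule the source encoding out entirely is to write the literal with Unicode escapes, which are plain ASCII. A minimal sketch, with the class name SourceEncodingIndependent invented for illustration:

```java
import java.nio.charset.StandardCharsets;

class SourceEncodingIndependent {
    // Unicode escapes are plain ASCII in the source file, so the compiler
    // builds the same string whatever encoding it assumes for the file.
    static final String SAMPLE = "\u4e4b\u8b02\u751a";

    public static void main(String[] args) {
        System.out.println(SAMPLE.length());                                 // 3
        System.out.println(SAMPLE.getBytes(StandardCharsets.UTF_8).length);  // 9
        System.out.println(Integer.toHexString(SAMPLE.codePointAt(1)));      // 8b02
    }
}
```

Since the question reports that hard-coded Chinese already prints correctly, this check mainly helps separate a source-encoding problem from a console-input problem.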
Source: https://stackoverflow.com/questions/13882378/java-console-not-reading-in-chinese-characters-correctly