I am converting a project from Ant to Maven and I'm having problems with a specific unit test which deals with UTF-8 characters. The problem is about the following String:
When debugging Unicode problems, convert everything to ASCII so you can read what is inside a String without guesswork. For example, use StringEscapeUtils from commons-lang3 to turn ä into \u00e4. That way, you can tell whether a ? is really in the data or only appears because the console can't print the character. You can also distinguish " " (\u0020) from " " (\u00a0).
In the test case, check the escaped version of the inputs as early as possible to make sure the data is actually what you expect.
So the code above should be:
assertEquals("\\u010d\\u00e4\\u....", escape(l_string));
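If you don't want to pull in commons-lang3 just for this, a small helper is enough. The following is only a sketch of what an escape() function could look like; the class name and the choice of lowercase hex digits are my assumptions, and StringEscapeUtils may format its output differently.

```java
final class UnicodeEscapeSketch {

    // Hypothetical helper: keeps printable ASCII as-is and writes every
    // other char as a backslash, the letter u, and four lowercase hex digits.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input is written with escapes so the source file stays pure ASCII.
        String input = "\u010d\u00e4 \u00a0";
        // The normal space stays a space, the no-break space becomes visible.
        System.out.println(escape(input));
    }
}
```

Because the helper's output is plain ASCII, it prints the same on any console, whatever encoding that console uses.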
Make sure you use the correct encoding for file I/O. Never rely on Java's default encoding; always use InputStreamReader/OutputStreamWriter and specify the encoding explicitly.
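As a rough illustration of the point about file I/O, this is how reading and writing with an explicit charset can look. The class and method names are made up for the example, and UTF-8 is assumed to be the encoding you want.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitEncodingSketch {

    // Writes text as UTF-8, independent of the platform default encoding.
    static void writeUtf8(Path file, String text) throws IOException {
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8))) {
            out.write(text);
        }
    }

    // Reads the whole file back, decoding the bytes as UTF-8.
    static String readUtf8(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("encoding-sketch", ".txt");
        try {
            String original = "\u010d\u00e4";
            writeUtf8(tmp, original);
            // Reports whether the text survived the round trip unchanged.
            System.out.println(original.equals(readUtf8(tmp)));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The important part is that the charset is named on both the writing and the reading side, so the result doesn't change between your IDE, Ant, and Maven.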
The POM looks correct. Run mvn with -X to make sure it picks up the correct options and passes them to the Java compiler. mvn help:effective-pom might also help.
Disassemble the class file (for example with javap -v) to check the strings. Java will use ? to denote that it couldn't read something.
If you get the ? from System.out.println( ">>> " + l_string );, this means the code wasn't compiled with UTF-8 or that the source file may have been saved with another Unicode encoding (UTF-16 or similar).
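To get a feeling for where a ? can come from, here is a small experiment. It is only a sketch with made-up names: it shows that encoding a character into a charset that can't represent it yields ?, while decoding UTF-8 bytes with the wrong charset yields different garbage. It doesn't reproduce your exact build setup.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkSketch {

    // Encodes to US-ASCII and decodes again; chars that ASCII cannot
    // represent are replaced by '?' during encoding.
    static String throughAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Decodes UTF-8 bytes as ISO-8859-1: each byte becomes one char,
    // so a two-byte character turns into two unrelated chars.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u010d\u00e4";
        System.out.println(throughAscii(text));               // prints "??"
        System.out.println(utf8ReadAsLatin1(text).length());  // prints 4
    }
}
```

So a ? usually points to a lossy encoding step somewhere, whereas two odd characters per letter point to bytes being decoded with the wrong charset.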
Another source of problems could be the properties file. Make sure it was saved as ISO-8859-1 and that it wasn't modified by the build process.
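The reason ISO-8859-1 matters is that Properties.load(InputStream) always decodes the bytes as ISO-8859-1. The sketch below feeds the same line to it in both encodings to show the difference; the key name "greeting" and the class name are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingSketch {

    // Loads properties from raw bytes via the InputStream overload,
    // which decodes the content as ISO-8859-1.
    static Properties load(byte[] fileContent) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(fileContent));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String line = "greeting=\u00e4\n";

        // Saved as ISO-8859-1: the value comes back as one char.
        String ok = load(line.getBytes(StandardCharsets.ISO_8859_1)).getProperty("greeting");
        System.out.println(ok.length()); // prints 1

        // Saved as UTF-8: the two bytes are read as two separate chars.
        String broken = load(line.getBytes(StandardCharsets.UTF_8)).getProperty("greeting");
        System.out.println(broken.length()); // prints 2
    }
}
```

If the file has to contain characters outside ISO-8859-1, such as č, writing them as Unicode escapes inside the properties file keeps it pure ASCII and avoids the question altogether.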
Make sure Maven actually compiles your file. Use mvn clean to force a full recompile.