I am trying to parse an XML file but ran into this error message: invalid byte 2 of 2-byte UTF-8 sequence
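You could try changing the default character encoding used by String.getBytes() to UTF-8 by starting the JVM with the option -Dfile.encoding=utf-8. This error usually means the parser was told to expect UTF-8 but received bytes that were written in a different encoding.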
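If the XML starts out as a String in your own code, another option is to pass the charset explicitly instead of relying on the JVM-wide default. Below is a minimal sketch, assuming the document declares UTF-8 and is parsed with the JDK's built-in DocumentBuilder. The names XmlEncodingSketch and parseUtf8 are made up for illustration, and your actual parser setup may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not taken from the original question.
class XmlEncodingSketch {

    // Encodes the XML text as UTF-8 explicitly, so the bytes handed to the
    // parser match the encoding named in the XML declaration regardless of
    // the platform default charset.
    static Document parseUtf8(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        // Sample input with a non-ASCII character (e with acute accent).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>caf\u00e9</name>";
        Document doc = parseUtf8(xml);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Calling xml.getBytes() with no argument uses the platform default charset, which on some systems is not UTF-8. Passing StandardCharsets.UTF_8 removes that dependency for this one call, without affecting the rest of the application the way -Dfile.encoding does.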