I'm investigating an issue where a username containing a Latin-1 character is entered in a login form. The username contains the character á. I investigate the server part where I h
In your example, the form sends the UTF-8 bytes for 'á' to Tomcat using percent-encoding (so over the wire it is %C3%A1). However, Tomcat decodes them as Latin-1 (ISO-8859-1), which is its default encoding for POST parameters.
So Tomcat stores the bytes C3 A1 internally as the two-character string 'Ã¡', since C3 is 'Ã' and A1 is '¡' in Latin-1.
When you ask for username.getBytes(), it creates a UTF-8 encoded byte array (assuming the platform default charset is UTF-8), so it encodes each of the two characters 'Ã' and '¡' in UTF-8, which gives C3 83 C2 A1.
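The double encoding described above can be reproduced outside Tomcat. The following is a minimal sketch, not Tomcat's actual code path: the class name EncodingDemo and its helper methods are made up for illustration, and the two wire bytes are hard-coded rather than read from a request.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Format a byte array as space-separated uppercase hex, e.g. "C3 A1".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode the wire bytes as Latin-1, which mimics the wrong default for POST data.
    static String misdecode(byte[] wire) {
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the original bytes via Latin-1, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // %C3%A1 after percent-decoding: the UTF-8 bytes of U+00E1 (a with acute).
        byte[] wire = {(byte) 0xC3, (byte) 0xA1};

        String garbled = misdecode(wire); // two characters: U+00C3 and U+00A1
        System.out.println(garbled.length());                               // 2
        System.out.println(hex(garbled.getBytes(StandardCharsets.UTF_8)));  // C3 83 C2 A1
        System.out.println(repair(garbled).equals("\u00E1"));               // true
    }
}
```

The repair step only works because Latin-1 maps every byte to exactly one character, so no information is lost by the wrong decode. It is a diagnostic aid here; configuring the correct encoding is still the cleaner fix.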
The Tomcat FAQ describes this in detail and proposes a solution: http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q3
Change the Valve of the FormAuthenticator in server.xml to specify characterEncoding="UTF-8":
<Context path="/YourSecureApp">
    <Valve
        className="org.apache.catalina.authenticator.FormAuthenticator"
        disableProxyCaching="false"
        characterEncoding="UTF-8" />
</Context>