I'm looking for a general strategy/advice on how to handle invalid UTF-8 input from users.
Even though my webapp uses UTF-8, somehow some users enter invalid chara
Invalid characters arriving from your web app may be caused by the character set the browser assumes for HTML forms. You can specify which character set a form should use with the accept-charset attribute, for example <form accept-charset="UTF-8">.
You might also want to look at similar questions on Stack Overflow for pointers on how to handle invalid characters, e.g. those in the column to the right. However, I think signaling an error to the user is better than trying to clean up the invalid characters, since cleanup can cause unexpected loss of significant data or unexpected changes to your user's input.
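As a rough illustration of the "signal an error instead of cleaning up" approach, here is a minimal Python sketch. It assumes the server receives the submitted field as raw bytes; the function name decode_user_input and the wording of the error message are hypothetical and not taken from any particular framework.

```python
def decode_user_input(raw: bytes) -> str:
    """Return the decoded text if raw is valid UTF-8, otherwise raise ValueError.

    Hypothetical helper: nothing is silently dropped or replaced, so the
    caller can show the problem to the user and ask them to resubmit.
    """
    try:
        # errors="strict" (the default) rejects any invalid byte sequence
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte
        raise ValueError(
            f"Input is not valid UTF-8 (bad byte at offset {exc.start})"
        ) from exc
```

A request handler could catch the ValueError and redisplay the form with a message, rather than storing a cleaned-up value the user never typed.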