I have to handle this scenario in Java:
I'm getting a request in XML form from a client with declared encoding=utf-8. Unfortunately it may contain not utf-8 charact
UTF-8 is an encoding; Unicode is a character set. But the GBP symbol is most definitely in the Unicode character set and therefore most certainly representable in UTF-8.
If you do in fact mean UTF-8, and you are actually trying to remove byte sequences that are not the valid encoding of a character in UTF-8, then...
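import java.nio.*;
import java.nio.charset.*;

// IGNORE makes the decoder silently drop malformed byte sequences instead of throwing.
CharsetDecoder utf8Decoder = StandardCharsets.UTF_8.newDecoder();
utf8Decoder.onMalformedInput(CodingErrorAction.IGNORE);
utf8Decoder.onUnmappableCharacter(CodingErrorAction.IGNORE);
ByteBuffer bytes = ...;
CharBuffer parsed = utf8Decoder.decode(bytes); // declares CharacterCodingException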
...
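For a self-contained illustration, here is a minimal sketch of the same decoder configuration wrapped in a method. The class name `Utf8Sanitizer`, the method name `stripInvalidUtf8`, and the sample bytes are assumptions made up for this example; they are not part of the original request-handling code. The lone byte 0xA3 is how ISO-8859-1 encodes the pound sign, and on its own it is not a valid UTF-8 sequence, so the decoder drops it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Sanitizer {

    // Hypothetical helper: decodes the bytes as UTF-8 and drops any
    // byte sequence that is not a valid UTF-8 encoding.
    static String stripInvalidUtf8(byte[] input) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.IGNORE);
        try {
            return decoder.decode(ByteBuffer.wrap(input)).toString();
        } catch (CharacterCodingException e) {
            // Not expected when both actions are IGNORE, but decode() declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 'a', a lone 0xA3 byte (invalid in UTF-8), 'b'
        byte[] raw = {'a', (byte) 0xA3, 'b'};
        System.out.println(stripInvalidUtf8(raw)); // prints "ab"
    }
}
```

Note that a correctly encoded pound sign (the two bytes 0xC2 0xA3) passes through unchanged; only invalid sequences are removed. If you would rather see where data was lost, `CodingErrorAction.REPLACE` substitutes U+FFFD instead of dropping the bytes.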