Remove non-UTF-8 characters from XML with declared encoding=utf-8 - Java

再見小時候 2020-12-13 14:42

I have to handle this scenario in Java:

I'm getting a request in XML form from a client with declared encoding=utf-8. Unfortunately it may contain non-UTF-8 charact

6 answers
  •  感动是毒
    2020-12-13 15:27

    Note that the first step should be to ask the creator of the XML (most likely a home-grown "just print data" XML generator) to ensure that their XML is correct before sending it to you. If they use Windows, the simplest possible test is to ask them to open the file in Internet Explorer and look at the parsing error reported at the first offending character.

    While they fix that, you can simply write a small program that changes the header part to declare that the encoding is ISO-8859-1 instead:

    <?xml version="1.0" encoding="ISO-8859-1"?>

    and leave the rest untouched.
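
A minimal Java sketch of such a program follows. It assumes the whole request is available as a byte array, starts directly with the XML declaration (no byte order mark), and already carries an encoding attribute. The class name `XmlDeclarationRewriter` and the method `declareLatin1` are hypothetical names chosen for illustration. Decoding and re-encoding with ISO-8859-1 maps every byte to exactly one character and back, so everything outside the declaration stays byte-for-byte identical.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class XmlDeclarationRewriter {

    // Matches the encoding value inside an XML declaration at the very start of the input.
    // Group 1 is everything up to and including the opening quote, group 2 is the closing quote.
    private static final Pattern ENCODING = Pattern.compile(
            "^(<\\?xml[^>]*?encoding\\s*=\\s*[\"'])[^\"']*([\"'])");

    // Hypothetical helper: returns a copy of the document whose declaration says ISO-8859-1.
    static byte[] declareLatin1(byte[] xml) {
        // ISO-8859-1 maps each byte to one char, so this round trip never alters the payload.
        String text = new String(xml, StandardCharsets.ISO_8859_1);
        // If no declaration with an encoding attribute is found, the text is returned unchanged.
        String patched = ENCODING.matcher(text).replaceFirst("$1ISO-8859-1$2");
        return patched.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The single byte 0xE9 is a valid ISO-8859-1 character but not valid UTF-8 on its own.
        byte[] request = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>\u00e9</a>"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] fixed = declareLatin1(request);
        System.out.println(new String(fixed, StandardCharsets.ISO_8859_1));
    }
}
```

Any byte sequence is valid ISO-8859-1, so a parser will accept the patched document. Bytes that really were multi-byte UTF-8 sequences will then show up as two or more Latin-1 characters, which is why this is only a stopgap until the sender produces correct XML.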
