Help the Java SAX parser understand bad XML

-上瘾入骨i 2020-12-19 13:24

I am parsing XML returned from a website, but sadly it is slightly malformed. I am getting XML like:


         


        
3 answers
  •  旧巷少年郎
    2020-12-19 13:59

    One option, as Jim Garrison suggested, is to provide a custom EntityResolver. However, that fixes only the specific issue you described. If the XML is malformed in some other way, e.g. by unclosed tags, an EntityResolver will not help. In that case I'd recommend running the input through one of the available HTML "purifiers" to turn the HTML syntax into well-formed XML. In my opinion the best one available is this one: http://nekohtml.sourceforge.net/
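
    A minimal sketch of the EntityResolver approach, using only the JDK's built-in SAX parser. The XML sample from the question is not shown here, so this assumes the problem is a DOCTYPE that points to an external DTD the parser cannot fetch. The class name `LenientSaxDemo`, the method `elementNames`, and the `example.invalid` URL are made up for illustration. The resolver hands the parser an empty source in place of every external entity, so no network lookup is attempted.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LenientSaxDemo {

    /** Parses the XML and returns the element names seen, comma-separated, in document order. */
    static String elementNames(String xml) throws Exception {
        StringBuilder names = new StringBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Assumption: the failure comes from an external DTD/entity reference.
        // Returning an empty InputSource makes the parser treat that entity as
        // empty instead of trying to open the system identifier.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));

        // Record each start tag so the caller can see that parsing got through.
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (names.length() > 0) {
                    names.append(',');
                }
                names.append(qName);
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input: the DTD URL does not exist and is never fetched.
        String xml = "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">"
                   + "<root><item/></root>";
        System.out.println(elementNames(xml));
    }
}
```

    As the answer notes, this only covers entity and DTD lookups. Input that is not well-formed (for example an unclosed tag) will still make `parse` throw a SAXParseException, which is where an HTML clean-up library comes in.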
