Parsing invalid ampersands with Android's XmlPullParsers

霸气de小男生 提交于 2019-11-28 03:34:58

问题


I am writing a little screen-scraping app that consumes some XHTML - it goes without saying that the XHTML is invalid: ampersands aren't escaped as &.

I am using Android's XmlPullParser and it spews out the following error upon the incorrectly encoded value:

org.xmlpull.v1.XmlPullParserException: unterminated entity ref 
(position:START_TAG <a href='/Fahrinfo/bin/query.bin/dox?ld=0.1&n=3&i=9c.0323581.1266265347&rt=0&vcra'>
@55:134 in java.io.InputStreamReader@43b1ef70) 

How do I get around this? I have thought about the following solutions:

  1. Wrapping the InputStream in another one that replaces the ampersands with entity refs
  2. Configuring the Parser so it magically accepts the incorrect markup

Which ones is likely to be more successful?


回答1:


I would go with your first option, replacing the ampersands seems more of a fit solution than the other. The second option seems more of a hack to get it to work by accepting incorrect markup.




回答2:


I was stuck on this for about an hour before figuring out that in my case it was the "&" that couldn't be resolved by the XML PULL PARSER, so i found the solution. So Here is a snippet of code which totally fix it.

void ParsingActivity(String r) {
    try {
        parserCreator = XmlPullParserFactory.newInstance();
        parser = parserCreator.newPullParser();
        // Here we give our file object in the form of a stream to the
        // parser.
        parser.setInput(new StringReader(r.replaceAll("&", "&amp;")));
        // as a SAX parser this will raise events/callback as and when it
        // comes to a element.
        int parserEvent = parser.getEventType();
        // we go thru a loop of all elements in the xml till we have
        // reached END of document.
        while (parserEvent != XmlPullParser.END_DOCUMENT) {
            switch (parserEvent) {
            // if u have reached start of a tag
            case XmlPullParser.START_TAG:
                // get the name of the tag
                String tag = parser.getName();

pretty much what I'm doing I'm just replacing the & with &amp; since I was dealing with parsing a URL. Hope this helps.



来源:https://stackoverflow.com/questions/2268895/parsing-invalid-ampersands-with-androids-xmlpullparsers

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!