Parsing invalid ampersands with Android's XmlPullParsers

醉酒成梦 2020-12-19 06:53

I am writing a little screen-scraping app that consumes some XHTML - it goes without saying that the XHTML is invalid: ampersands aren't escaped as &amp;.

2 Answers
  • 2020-12-19 07:15

    I was stuck on this for about an hour before figuring out that, in my case, it was the "&" that the XmlPullParser couldn't resolve. Here is the snippet of code that fixed it:

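    import java.io.IOException;
    import java.io.StringReader;
    import org.xmlpull.v1.XmlPullParser;
    import org.xmlpull.v1.XmlPullParserException;
    import org.xmlpull.v1.XmlPullParserFactory;

    void ParsingActivity(String r) {
        try {
            XmlPullParserFactory parserCreator = XmlPullParserFactory.newInstance();
            XmlPullParser parser = parserCreator.newPullParser();
            // Escape every raw "&" and hand the result to the parser as a
            // character stream. Note: this also re-escapes entities that are
            // already valid (e.g. "&lt;" becomes "&amp;lt;").
            parser.setInput(new StringReader(r.replaceAll("&", "&amp;")));
            // Unlike SAX, a pull parser does not fire callbacks; we ask it
            // for each event ourselves.
            int parserEvent = parser.getEventType();
            // Loop over all events until we reach the end of the document.
            while (parserEvent != XmlPullParser.END_DOCUMENT) {
                switch (parserEvent) {
                // We have reached the start of a tag.
                case XmlPullParser.START_TAG:
                    // Get the name of the tag.
                    String tag = parser.getName();
                    // (the rest of the original snippet is cut off here)
                    break;
                }
                // Advance to the next event so the loop can terminate.
                parserEvent = parser.next();
            }
        } catch (XmlPullParserException | IOException e) {
            e.printStackTrace();
        }
    }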
    

    That is pretty much all I'm doing: replacing each & with &amp;, since I was parsing a URL. Hope this helps.

  • 2020-12-19 07:16

    I would go with your first option: replacing the ampersands seems a better fit than the other. The second option seems more of a hack that gets things working by accepting incorrect markup.

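One caveat with a blanket `replaceAll("&", "&amp;")` is that it also rewrites ampersands that already begin a valid entity, so `&amp;` turns into `&amp;amp;`. A possible refinement, sketched here as an assumption rather than anything either answer proposed, is to escape only the ampersands that are not followed by something shaped like an entity reference. The class and method names (`AmpersandEscaper`, `escapeBareAmpersands`) are hypothetical.

```java
import java.util.regex.Pattern;

final class AmpersandEscaper {
    // Matches "&" unless it starts something shaped like an entity:
    // a named reference (&name;), a decimal one (&#38;) or a hex one (&#x26;).
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    private AmpersandEscaper() {}

    // Returns the markup with every bare "&" replaced by "&amp;";
    // ampersands that already begin an entity reference are left alone.
    static String escapeBareAmpersands(String markup) {
        return BARE_AMPERSAND.matcher(markup).replaceAll("&amp;");
    }
}
```

The result could then be passed to the parser in place of the plain replacement, for example `parser.setInput(new StringReader(AmpersandEscaper.escapeBareAmpersands(r)))`. This is a heuristic: text such as `a&b;c` looks like an entity and is left untouched, and named entities outside XML's five predefined ones (such as `&nbsp;`) pass through unchanged and may still need separate handling.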