SAX parsing and special characters

Posted by 风流意气都作罢 on 2019-12-17 14:55:13

Question


I want to parse some data from an XML file using a SAX parser. My XML is as follows:

<categories>
 <cat>Pies &amp; past</cat>
 <cat>Fruits</cat>
</categories>

In order to parse this data I extend DefaultHandler.

The output after parsing is:

cat 1 = Pies

cat 2 = &

cat 3 = past

cat 4 = Fruits

Why is this happening instead of getting:

cat 1 = Pies & past

cat 2 = Fruits

Answer 1:


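My guess is that you are treating each call to characters as delivering the complete text for a cat element. Judging by your output, the parser reports the text before the &amp; entity reference, the entity itself, and the text after it as three separate chunks. You should code your handler so that successive calls to characters accumulate the text, and you only capture it on the endElement event: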

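import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class CatHandler extends DefaultHandler {
    // Buffer for the text of the cat element currently being parsed
    private final StringBuilder chars = new StringBuilder();

    @Override
    public void startElement(String uri, String lName, String qName, Attributes a) {
        // SAX passes an empty string (not null) when a name is unavailable
        final String name = (qName == null || qName.isEmpty()) ? lName : qName;
        if ("cat".equals(name)) {
            chars.setLength(0); // new element: discard any earlier text
        } // else . . .
    }

    @Override
    public void endElement(String uri, String lName, String qName) {
        final String name = (qName == null || qName.isEmpty()) ? lName : qName;
        if ("cat".equals(name)) {
            String catName = chars.toString();
            // do something with cat name
        } // else . . .
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so only accumulate here
        chars.append(ch, start, length);
    }
}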
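
To try the accumulate-then-capture pattern end to end, here is a minimal self-contained sketch run against the XML from the question. The names CatDemo, CollectingHandler and parseCats are made up for illustration, and it assumes the JDK's default, non-namespace-aware SAXParserFactory, so qName carries the element name:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; none of these names come from the question.
class CatDemo {

    // Accumulates text across characters() calls and stores it on endElement().
    static class CollectingHandler extends DefaultHandler {
        final List<String> cats = new ArrayList<>();
        private final StringBuilder chars = new StringBuilder();

        @Override
        public void startElement(String uri, String lName, String qName, Attributes a) {
            if ("cat".equals(qName)) {
                chars.setLength(0); // start a fresh buffer for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            chars.append(ch, start, length); // may run several times per element
        }

        @Override
        public void endElement(String uri, String lName, String qName) {
            if ("cat".equals(qName)) {
                cats.add(chars.toString()); // the text is complete only now
            }
        }
    }

    // Parses the given XML string and returns the text of every cat element.
    static List<String> parseCats(String xml) {
        try {
            CollectingHandler handler = new CollectingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.cats;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<categories><cat>Pies &amp; past</cat><cat>Fruits</cat></categories>";
        List<String> cats = parseCats(xml);
        for (int i = 0; i < cats.size(); i++) {
            System.out.println("cat " + (i + 1) + " = " + cats.get(i));
        }
    }
}
```

Running main should print the two lines the asker expected, "cat 1 = Pies & past" and "cat 2 = Fruits", no matter how many chunks the parser splits the text into.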



Answer 2:


The characters() method isn't guaranteed to deliver the complete text of an element in a single call. Rather, you should accumulate the text reported by each characters() call and use the concatenated result on the corresponding endElement() call.

From the doc:

The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks

(my emphasis)
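
If you want to see the chunking for yourself, a small sketch like the following records each characters() call separately. ChunkDemo, ChunkRecorder and chunksOf are hypothetical names used only here. How many chunks you get is up to the parser implementation, so the code makes no assumption about the count; only the concatenation of the chunks is guaranteed to be the full text:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class for observing how a parser chunks character data.
class ChunkDemo {

    // Records every characters() call as its own string.
    static class ChunkRecorder extends DefaultHandler {
        final List<String> chunks = new ArrayList<>();

        @Override
        public void characters(char[] ch, int start, int length) {
            chunks.add(new String(ch, start, length));
        }
    }

    // Parses the XML and returns the chunks in the order they were reported.
    static List<String> chunksOf(String xml) {
        try {
            ChunkRecorder recorder = new ChunkRecorder();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), recorder);
            return recorder.chunks;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        List<String> chunks = chunksOf("<cat>Pies &amp; past</cat>");
        // The number of chunks is parser-dependent.
        for (String chunk : chunks) {
            System.out.println("chunk: [" + chunk + "]");
        }
        // Joining the chunks always yields the complete element text.
        System.out.println("joined: [" + String.join("", chunks) + "]");
    }
}
```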



Source: https://stackoverflow.com/questions/13336140/sax-parsing-and-special-characters
