Speeding up XPath

广开言路 2020-12-24 14:09

I have a 1000-entry document whose format is something like:

6 answers
  •  孤城傲影
    2020-12-24 14:42

    If you need to parse huge but flat documents, SAX is a good alternative. It allows you to handle the XML as a stream instead of building a huge DOM. Your example could be parsed using a ContentHandler like this:

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.ext.DefaultHandler2;
    
    public class ExampleHandler extends DefaultHandler2 {
    
        // StringBuilder suffices here: the handler is used from a single thread.
        private final StringBuilder chars = new StringBuilder(1000);
    
        private MyEntry currentEntry;
        private MyEntryHandler myEntryHandler;
    
        ExampleHandler(MyEntryHandler myEntryHandler) {
            this.myEntryHandler = myEntryHandler;
        }
    
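        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            // Append only the reported slice; ch may hold data outside
            // [start, start + length).
            chars.append(ch, start, length);
        }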
    
        @Override
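        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // localName is only populated when the parser is namespace-aware;
            // otherwise compare against qName instead.
            if ("Entry".equals(localName)) {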
                myEntryHandler.handle(currentEntry);
                currentEntry = null;
            }
            else if ("n1".equals(localName)) {
                currentEntry.setN1(chars.toString());
            }
            else if ("n2".equals(localName)) {
                currentEntry.setN2(chars.toString());
            }
        }
    
    
        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            chars.setLength(0);
            if ("Entry".equals(localName)) {
                currentEntry = new MyEntry();
            }
        }
    }
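
    To run a handler like the one above, it has to be passed to a SAX parser. The following is a minimal sketch, not the original poster's setup: `SaxEntryDemo`, its `Entry` class (a stand-in for `MyEntry`), `EntryCollector`, and the `Root` wrapper element in the usage note are hypothetical names chosen for illustration. It assumes the factory is made namespace-aware so that `localName` is populated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEntryDemo {

    /** Hypothetical stand-in for MyEntry. */
    static class Entry {
        String n1;
        String n2;
    }

    /** Condensed handler that collects every Entry element into a list. */
    static class EntryCollector extends DefaultHandler {
        final List<Entry> entries = new ArrayList<>();
        private final StringBuilder chars = new StringBuilder();
        private Entry current;

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes atts) {
            chars.setLength(0); // start collecting text for this element
            if ("Entry".equals(localName)) {
                current = new Entry();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            chars.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return; // ignore anything outside an Entry
            }
            switch (localName) {
                case "n1": current.n1 = chars.toString(); break;
                case "n2": current.n2 = chars.toString(); break;
                case "Entry": entries.add(current); current = null; break;
                default: break;
            }
        }
    }

    /** Parses the XML as a stream and returns the collected entries. */
    static List<Entry> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for localName to be set
            SAXParser parser = factory.newSAXParser();
            EntryCollector handler = new EntryCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.entries;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
    }
}
```

    Calling `SaxEntryDemo.parse("<Root><Entry><n1>a</n1><n2>b</n2></Entry></Root>")` should yield a single entry. For a large file, an `InputSource` over a file stream could be used in place of the `StringReader`.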
    

    If the document has a deeper, more complex structure, you will need a stack to keep track of the current path in the document. In that case, consider writing a general-purpose ContentHandler that does the dirty work and delegates to handlers specific to your document type.
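
    One possible shape for such a general-purpose handler is sketched below. It is an illustration under assumptions, not a definitive design: `PathTrackingHandler`, its `PathListener` callback, and the `collect` helper are hypothetical names. It uses `qName`, so it works with a parser that is not namespace-aware, and it reports only non-blank text found directly before an element's end tag.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Tracks the current element path on a stack and reports text per path. */
class PathTrackingHandler extends DefaultHandler {

    /** Hypothetical callback for document-type-specific logic. */
    interface PathListener {
        void onText(String path, String text);
    }

    private final Deque<String> stack = new ArrayDeque<>();
    private final StringBuilder chars = new StringBuilder();
    private final PathListener listener;

    PathTrackingHandler(PathListener listener) {
        this.listener = listener;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) {
        stack.addLast(qName); // descend one level
        chars.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        chars.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = chars.toString().trim();
        if (!text.isEmpty()) {
            listener.onText(currentPath(), text);
        }
        chars.setLength(0);
        stack.removeLast(); // ascend one level
    }

    /** Builds a path such as /Root/Entry/n1 from root to current element. */
    private String currentPath() {
        StringBuilder sb = new StringBuilder();
        for (String name : stack) {
            sb.append('/').append(name);
        }
        return sb.toString();
    }

    /** Convenience helper: returns "path=text" for every text-bearing element. */
    static List<String> collect(String xml) {
        List<String> out = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new PathTrackingHandler((path, text) -> out.add(path + "=" + text)));
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return out;
    }
}
```

    A document-specific listener can then switch on the reported path (for example `/Root/Entry/n1`, where `Root` is an assumed wrapper element) instead of tracking nesting state itself.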
