I have a 1000-entry document whose format is something like:
If you need to parse huge but flat documents, SAX is a good alternative. It lets you handle the XML as a stream of events instead of building a large DOM tree in memory. Your example could be parsed with a ContentHandler like this:
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.ext.DefaultHandler2;

    public class ExampleHandler extends DefaultHandler2 {

        // Text collected for the element currently being read.
        private final StringBuilder chars = new StringBuilder(1000);
        private final MyEntryHandler myEntryHandler;
        private MyEntry currentEntry;

        ExampleHandler(MyEntryHandler myEntryHandler) {
            this.myEntryHandler = myEntryHandler;
        }

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            // Append only the reported slice: ch is the parser's own buffer,
            // and the text of one element may arrive in several chunks.
            chars.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // localName is only filled in by a namespace-aware parser
            // (SAXParserFactory.setNamespaceAware(true)); otherwise compare qName.
            if ("Entry".equals(localName)) {
                // The entry is complete: hand it off and forget it.
                myEntryHandler.handle(currentEntry);
                currentEntry = null;
            }
            else if ("n1".equals(localName)) {
                currentEntry.setN1(chars.toString());
            }
            else if ("n2".equals(localName)) {
                currentEntry.setN2(chars.toString());
            }
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            // Reset the text buffer at the start of every element.
            chars.setLength(0);
            if ("Entry".equals(localName)) {
                currentEntry = new MyEntry();
            }
        }
    }
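To actually run a handler like this, something has to feed it parse events. Here is a minimal sketch of that wiring, assuming the standard JAXP `SAXParserFactory`. `MyEntry` and `MyEntryHandler` are not shown above, so the shapes used here (two string fields, a single `handle` method) are guesses, and the `<Entries>` wrapper element is a placeholder for whatever your real document uses. The handler is a condensed copy of the one above so the example compiles on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxWiringSketch {

    // Assumed shape of the entry bean; the answer does not show it.
    static class MyEntry {
        String n1;
        String n2;
    }

    // Assumed callback interface; the answer does not show it.
    interface MyEntryHandler {
        void handle(MyEntry entry);
    }

    // Condensed version of the handler above.
    static class EntryHandler extends DefaultHandler {
        private final StringBuilder chars = new StringBuilder();
        private final MyEntryHandler sink;
        private MyEntry current;

        EntryHandler(MyEntryHandler sink) {
            this.sink = sink;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            chars.setLength(0);
            if ("Entry".equals(localName)) {
                current = new MyEntry();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            chars.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("Entry".equals(localName)) {
                sink.handle(current);
                current = null;
            } else if ("n1".equals(localName)) {
                current.n1 = chars.toString();
            } else if ("n2".equals(localName)) {
                current.n2 = chars.toString();
            }
        }
    }

    // Parses the given XML text and returns every entry the handler reported.
    static List<MyEntry> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Needed so that localName is populated in the callbacks.
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();

        List<MyEntry> entries = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new EntryHandler(entries::add));
        return entries;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Entries><Entry><n1>a</n1><n2>b</n2></Entry>"
                   + "<Entry><n1>c</n1><n2>d</n2></Entry></Entries>";
        for (MyEntry e : parse(xml)) {
            System.out.println(e.n1 + " " + e.n2);
        }
    }
}
```

Collecting into a list is only for the demo. With a really large file you would process each entry inside `handle` and let it go, which is the whole point of streaming.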
If the document has a deeper, more complex structure, you will need a stack to keep track of the current path in the document. At that point, consider writing a general-purpose ContentHandler that does the dirty work and delegates to your document-type-specific handlers.
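One way such a general-purpose handler might look is sketched below. It keeps one stack of element names and one of text buffers, and reports each non-empty text value together with its full path. The `PathListener` interface, the class names, and the `/a/b` path format are all made up for illustration. It uses `qName`, so it works with the default (non-namespace-aware) parser.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PathTrackingSketch {

    // Hypothetical callback: document-specific code implements this.
    interface PathListener {
        void text(String path, String value);
    }

    static class PathHandler extends DefaultHandler {
        // Names of the currently open elements, root first.
        private final Deque<String> path = new ArrayDeque<>();
        // One text buffer per open element, so nested text does not mix.
        private final Deque<StringBuilder> text = new ArrayDeque<>();
        private final PathListener listener;

        PathHandler(PathListener listener) {
            this.listener = listener;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.addLast(qName);
            text.addLast(new StringBuilder());
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (!text.isEmpty()) {
                text.peekLast().append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.removeLast().toString().trim();
            if (!value.isEmpty()) {
                // Report before popping, so the path still includes this element.
                listener.text("/" + String.join("/", path), value);
            }
            path.removeLast();
        }
    }

    // Demo helper: returns "path=value" strings in document order.
    static List<String> collect(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new PathHandler((p, v) -> seen.add(p + "=" + v)));
        return seen;
    }
}
```

The document-specific part then shrinks to a listener that switches on the path, so two elements with the same name at different depths (say `/a/b` and `/a/c/b`) can be told apart without extra flags.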