Question
Is it feasible in Java, using the SAX API, to parse a list of XML fragments with no root element from a stream input?
I tried parsing such input but got a
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
before the endDocument event was even fired.
I would rather not settle for obvious but clumsy solutions such as "prepend a custom root element" or "use buffered fragment parsing".
I am using the standard SAX API in Java 1.6. The SAX factory was configured with setValidating(false), in case anyone wonders.
Answer 1:
First, and most important of all, the content you are parsing is not an XML document. From the XML Specification:
[Definition: There is exactly one element, called the root, or document element, no part of which appears in the content of any other element.]
Now, as to parsing this with SAX - in spite of what you said about clumsiness - I'd suggest the following approach:
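import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;

// Wrap the fragment stream in a synthetic <root> element (the tags are plain ASCII).
Enumeration<InputStream> streams = Collections.enumeration(
    Arrays.asList(new InputStream[] {
        new ByteArrayInputStream("<root>".getBytes()),
        yourXmlLikeStream,
        new ByteArrayInputStream("</root>".getBytes()),
    }));
SequenceInputStream seqStream = new SequenceInputStream(streams);
// Now pass `seqStream` to the SAX parser.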
SequenceInputStream is a convenient way to concatenate multiple input streams into a single stream. The streams are read in the order they are passed to the constructor (or, in this case, the order in which the Enumeration returns them).
Pass it to your SAX parser, and you are done.
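For completeness, here is a minimal end-to-end sketch of the same wrapping idea, fed into a SAX parser. The class name FragmentParser, the method topLevelElementNames, and the wrapper name synthetic-root are made up for illustration and are not part of any API. The handler tracks nesting depth so that the synthetic wrapper is skipped and only the original top-level fragments are reported.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class FragmentParser {
    // Hypothetical wrapper element name; choose one that cannot clash with real content.
    static final String WRAPPER = "synthetic-root";

    // Returns the names of the top-level fragments found in the stream.
    static List<String> topLevelElementNames(InputStream fragments) throws Exception {
        // Assumption: the fragment stream is UTF-8 (or ASCII-compatible).
        InputStream wrapped = new SequenceInputStream(Collections.enumeration(Arrays.asList(
            new ByteArrayInputStream(("<" + WRAPPER + ">").getBytes("UTF-8")),
            fragments,
            new ByteArrayInputStream(("</" + WRAPPER + ">").getBytes("UTF-8")))));

        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Depth 0 is the synthetic wrapper; depth 1 holds the original fragments.
                if (depth == 1) {
                    names.add(qName);
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(wrapped, handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream("<a/><b><c/></b>".getBytes("UTF-8"));
        System.out.println(topLevelElementNames(in)); // prints [a, b]
    }
}
```

Note that this sketch assumes the fragment stream does not begin with an XML declaration (<?xml ...?>). A declaration is only allowed at the very start of a document, so it would have to be stripped before wrapping.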
Source: https://stackoverflow.com/questions/11226747/parse-a-list-of-xml-fragments-with-no-root-element-from-a-stream-input