Validate XML against multiple arbitrary schemas

Anonymous (unverified), submitted 2019-12-03 03:03:02

Question:

Consider an XML document that starts like the following with multiple schemas (this is NOT a Spring-specific question; this is just a convenient XML doc for the example):

    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:jaxrs="http://cxf.apache.org/jaxrs"
           xmlns:osgi="http://www.springframework.org/schema/osgi"
           xsi:schemaLocation="http://www.springframework.org/schema/beans
                   http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
               http://cxf.apache.org/jaxrs
                   http://cxf.apache.org/schemas/jaxrs.xsd
               http://www.springframework.org/schema/osgi
                   http://www.springframework.org/schema/osgi/spring-osgi.xsd">

I want to validate the document, but I don't know in advance which namespaces the document author will use. I trust the document author, so I'm willing to download arbitrary schema URLs. How do I implement my validator?

I know that I can specify my schemas with a DocumentBuilderFactory instance by calling setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource", new String[] {...}), but I don't know the schema URLs until the document is parsed.

Of course, I could extract the XSD URLs myself after parsing the document and then run it through the validator, specifying the "http://java.sun.com/xml/jaxp/properties/schemaSource" attribute as above, but surely there's already an implementation that does that automatically?
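For reference, the manual route mentioned above could start with something like the following sketch. It parses the document without validation and reads the namespace/location pairs from the root element's xsi:schemaLocation attribute. The class and method names (SchemaLocationExtractor, parse, extract) are made up for illustration, and the sketch assumes the hints sit on the root element only; the attribute may legally appear on nested elements too, which this does not handle.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of any library.
final class SchemaLocationExtractor {

    /** Parses the XML without validation so the schema hints can be read first. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getAttributeNS below
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    /** Returns namespace URI -> schema URL pairs from the root element's xsi:schemaLocation. */
    static Map<String, String> extract(Document doc) {
        Map<String, String> locations = new LinkedHashMap<>();
        // getAttributeNS returns an empty string when the attribute is absent.
        String hint = doc.getDocumentElement().getAttributeNS(
                XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI, "schemaLocation");
        // The value is a whitespace-separated list of (namespace, location) pairs.
        String[] tokens = hint.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            locations.put(tokens[i], tokens[i + 1]);
        }
        return locations;
    }
}
```

The values of the returned map are the schema URLs that would then be handed to the schemaSource attribute in a second, validating parse.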

Answer 1:

Forgive me for answering my own question... The other answers from @Eugene Yokota and @forty-two were VERY helpful, but I thought they were not complete enough to accept. I needed to do additional work to compose the suggestions into the final solution below. The following works perfectly under JDK 1.6. It does not have sufficient error checking (see the link in Eugene's answer that is a very complete solution -- but is not reusable) nor does it cache the downloaded XSDs, I believe. I think it exploits specific features of the Xerces parser, because of the apache.org feature URLs.

    InputStream xmlStream = ...

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    factory.setValidating(true);
    factory.setXIncludeAware(true);
    factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
            "http://www.w3.org/2001/XMLSchema");
    factory.setFeature("http://apache.org/xml/features/validation/schema-full-checking", true);
    factory.setFeature("http://apache.org/xml/features/honour-all-schemaLocations", true);
    factory.setFeature("http://apache.org/xml/features/validate-annotations", true);
    factory.setFeature("http://apache.org/xml/features/generate-synthetic-annotations", true);

    DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setErrorHandler(new ErrorHandler() {
        public void warning(SAXParseException exception) throws SAXException {
            LOG.log(Level.WARNING, "parse warn: " + exception, exception);
        }
        public void error(SAXParseException exception) throws SAXException {
            LOG.log(Level.SEVERE, "parse error: " + exception, exception);
        }
        public void fatalError(SAXParseException exception) throws SAXException {
            LOG.log(Level.SEVERE, "parse fatal: " + exception, exception);
        }
    });

    Document doc = builder.parse(xmlStream);


Answer 2:

I haven't confirmed this, but you might find "Use JAXP Validation API to create a validator and validate input from a DOM which contains inline schemas and multiple validation roots" useful.

In particular,

    factory.setFeature(SCHEMA_FULL_CHECKING_FEATURE_ID, schemaFullChecking);
    factory.setFeature(HONOUR_ALL_SCHEMA_LOCATIONS_ID, honourAllSchemaLocations);
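As a related sketch using the same JAXP Validation API (javax.xml.validation): for W3C XML Schema, calling SchemaFactory.newSchema() with no arguments yields a Schema that locates schemas from the location hints in the document itself, so the URLs do not need to be known in advance. This is not taken from the linked sample; the class name LocationHintValidator and the way problems are collected into a list are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of any library.
final class LocationHintValidator {

    /** Validates the XML against whatever schemas its xsi:schemaLocation hints name. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SchemaFactory schemaFactory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // No-argument newSchema(): schemas are found via hints in the document.
            Validator validator = schemaFactory.newSchema().newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // stop here; recorded by the catch block below
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }
}
```

An empty list means the document validated cleanly. Each schema is fetched from the URL given in the document, so this only suits input from a trusted author, as in the question.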


Answer 3:

If you create a DocumentBuilderFactory like so:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    dbf.setNamespaceAware(true);
    dbf.setAttribute(
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
            "http://www.w3.org/2001/XMLSchema");

You can then set an EntityResolver on the DocumentBuilder instances created by this factory to get a chance to resolve the schema locations referred to in the directives. The specified location will be present in the systemId argument.

I thought the builder would do this automatically, without specifying a resolver, but obviously not out of the box. Maybe it is controlled by another feature, attribute, or property?
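A minimal sketch of the EntityResolver approach described in this answer, assuming the parser passes each schema location to the resolver as systemId. Here the resolver serves schema text from an in-memory map instead of downloading it; the class name ResolvingValidator and the map parameter are hypothetical, and a real resolver might read from a local cache directory instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of any library.
final class ResolvingValidator {

    /** Parses and validates xml, serving known schema locations from the given map. */
    static List<String> validate(String xml, Map<String, String> schemaTextByLocation) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(true);
            dbf.setNamespaceAware(true);
            dbf.setAttribute(
                    "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                    "http://www.w3.org/2001/XMLSchema");
            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                String schemaText =
                        systemId == null ? null : schemaTextByLocation.get(systemId);
                if (schemaText == null) {
                    return null; // unknown location: fall back to default resolution
                }
                InputSource source = new InputSource(new StringReader(schemaText));
                source.setSystemId(systemId);
                return source;
            });
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    // warnings are ignored in this sketch
                }
                @Override public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // recorded by the catch block below
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }
}
```

Returning null from the resolver leaves unknown locations to the parser's default handling, so the same hook could be used only to cache or redirect selected schemas.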


