XML Validation in Java: processContents="lax" seems not to work correctly

Submitted by 大憨熊 on 2019-12-05 00:35:29

Question


I have an XML Schema which contains a number of

<any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded" />

definitions, i.e., it allows arbitrary tags from other namespaces to be inserted. processContents="lax" indicates that the parser should try to validate these tags if it has the corresponding schema (1) (2).

To me this means that if I give the parser all the schema documents and there is an invalid XML tag from one of the secondary namespaces, it must report an error.

However, it seems that the Java XML validator ignores such errors. I have verified that the parser has all the necessary schema documents to perform the validation (if I change the XML schema to processContents="strict", it works as expected and uses the secondary schema documents for validation). The validator seems to behave as if the attribute were specified with the value skip.

Java code for validation:

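import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/*
 * xmlDokument is the file name of the XML document
 * xsdSchema is an array with all schema documents
 */
public static void validate( String xmlDokument, Source[] xsdSchema ) throws SAXException, IOException {
  SchemaFactory schemaFactory = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI );
  Schema schema = schemaFactory.newSchema( xsdSchema );
  Validator validator = schema.newValidator();
  validator.setErrorHandler( new MyErrorHandler() );
  validator.validate( new StreamSource(new File(xmlDokument)) );
}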
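
The MyErrorHandler class is not shown in the question. As a rough sketch of what such a handler could look like, the following hypothetical CollectingErrorHandler records every reported problem, so it is easy to see whether the validator raised anything at all. The class name, the message format, and the getProblems() accessor are assumptions made for illustration, not the asker's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical stand-in for MyErrorHandler, which the question does not show.
class CollectingErrorHandler implements ErrorHandler {

    private final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        // Recoverable validation errors (the cvc-* messages) arrive here.
        record("error", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Fatal errors (e.g. malformed XML) are recorded and then rethrown.
        record("fatal", e);
        throw e;
    }

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    List<String> getProblems() {
        return problems;
    }
}
```

A handler like this could be passed to validator.setErrorHandler(...) and inspected after validate(...) returns. An empty list then means the validator reported nothing, which rules out the handler itself swallowing errors.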

Minimal example:

The primary schema:

<xs:schema
    xmlns="baseNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="baseNamespace"
    xmlns:tns="baseNamespace">

<!-- Define single tag "baseTag" -->
<xs:element name="baseTag">
  <xs:complexType>
    <xs:sequence>
      <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>

The secondary schema:

<xs:schema
    xmlns="secondaryNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="secondaryNamespace"
    xmlns:tns="secondaryNamespace"
    elementFormDefault="qualified"
    attributeFormDefault="qualified">

<xs:element name="additionalTag"/>

</xs:schema>

The XML document I am trying to validate:

<baseTag
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="baseNamespace"
  xmlns:secondary="secondaryNamespace"
  xsi:schemaLocation="
    baseNamespace base.xsd
    secondaryNamespace secondary.xsd">

  <secondary:additionalTag/>
  <secondary:invalidTag/>
</baseTag>

Running the above Java code with both schema documents does not produce any validation errors. I only get an error if I change lax to strict in the base schema (which I don't want). The error message in that case is

cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'secondary:invalidTag'.

Questions:

Did I misunderstand something and is this actually the correct behavior? Or am I right regarding processContents?

Are my schema documents doing the right thing?

Is my Java code correct? How could I change it so that it behaves as expected?


Answer 1:


According to the spec:

"It will validate elements and attributes for which it can obtain schema information, but it will not signal errors for those it cannot obtain any schema information."

So, when you use processContents="lax", the validator cannot find a declaration for "invalidTag" in any of the schemas and therefore ignores it, as per the spec.
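
To see the difference between "no declaration available" and "declaration available but content invalid", here is a small self-contained sketch. It inlines the two schemas from the question as strings, with one deliberate change that is an assumption of this example: additionalTag is given the type xs:int so that it can actually be invalid. The class name LaxWildcardDemo and the helpers doc and validate are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the schemas mirror the question's minimal example.
class LaxWildcardDemo {

    static final String BASE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='baseNamespace'>"
      + "<xs:element name='baseTag'><xs:complexType><xs:sequence>"
      + "<xs:any namespace='##other' processContents='lax' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Assumption: additionalTag is typed xs:int here so that its content can be invalid.
    static final String SECONDARY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='secondaryNamespace' elementFormDefault='qualified'>"
      + "<xs:element name='additionalTag' type='xs:int'/></xs:schema>";

    // Wraps child markup in a baseTag root; prefix "s" is bound to secondaryNamespace.
    static String doc(String children) {
        return "<baseTag xmlns='baseNamespace' xmlns:s='secondaryNamespace'>"
             + children + "</baseTag>";
    }

    // Validates the document against both schemas and returns the reported error messages.
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new Source[] {
                new StreamSource(new StringReader(BASE_XSD)),
                new StreamSource(new StringReader(SECONDARY_XSD)) });
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // No declaration for invalidTag exists, so the lax wildcard lets it through.
        System.out.println(validate(doc("<s:invalidTag/>")));
        // A declaration for additionalTag exists, so its content is checked.
        System.out.println(validate(doc("<s:additionalTag>not a number</s:additionalTag>")));
        System.out.println(validate(doc("<s:additionalTag>42</s:additionalTag>")));
    }
}
```

Under these assumptions, the first and third calls should return an empty list, while the second should report at least one error, because a declaration for additionalTag was found and its content does not match it. In other words, lax is not the same as skip: it validates whatever it can find a declaration for, and stays silent only about elements it has no declaration for.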



Source: https://stackoverflow.com/questions/7820774/xml-validation-in-java-processcontents-lax-seems-not-to-work-correctly
