XML Schema: Different Element Names (sequence)

Posted by 流过昼夜 on 2019-12-12 23:05:50

Question


I think the solution to my problem is very easy, but I couldn't find it. So, here it is:

I have an XML document that contains a list of elements with different names, but in sequence. An example:

<DOC>
 <DOC_OBL_1>
  <TIP_DOC_OBL>1</TIP_DOC_OBL> 
 </DOC_OBL_1>
 <DOC_OBL_2>
  <TIP_DOC_OBL>2</TIP_DOC_OBL> 
 </DOC_OBL_2>
 <DOC_OBL_3>
  <TIP_DOC_OBL>3</TIP_DOC_OBL>  
 </DOC_OBL_3>
</DOC>

So, I have 3 elements: DOC_OBL_1, DOC_OBL_2 and DOC_OBL_3. And yes, there could be number 4, 5, 6, etc. As you can see, all 3 have the same elements inside (actually, they have a lot of them, but those aren't important right now), and I thought I could declare a general type that could validate this kind of document.

How can I validate this with a schema?

I know it's a very ugly XML (maybe it isn't standard, please tell me, I don't know), but building this document is not my concern. I just have to parse it, validate it and transform it.


Answer 1:


Well, sure you can! Pretty simple actually: if the structure is the same for each element, you can define a single <xs:complexType> to validate that, and then use:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="DOC" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DOC">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="DOC_OBL_1" type="DocType" />
        <xs:element name="DOC_OBL_2" type="DocType" />
        <xs:element name="DOC_OBL_3" type="DocType" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="DocType">
    <xs:sequence>
      <xs:element name="TIP_DOC_OBL" type="xs:string" minOccurs="0" />
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Does that work for you? Does it handle all your needs?

As Zach points out quite correctly, this "solution" is obviously rather limited, since it can't deal with an arbitrary number of tags DOC_OBL_1, DOC_OBL_2, ...., DOC_OBL_x: the names, and thus the number of tags, must be known ahead of time.

This is unfortunate, but it's the only solution, given this crippled XML. The REAL solution would be to have something like:

<DOC>
  <DOC_OBL id="1">
  </DOC_OBL>
  <DOC_OBL id="2">
  </DOC_OBL>
  .....
  <DOC_OBL id="x">
  </DOC_OBL>
</DOC>

and then the XML schema would become even easier and could deal with any number of <DOC_OBL> tags.
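
If you are allowed to pre-process the document before validating it, one way to get to that shape is to rename the numbered elements yourself. The following is only a sketch using Python's standard xml.etree.ElementTree; the names normalize_doc and SIMPLE_XSD are made up for illustration, and the schema text is my guess at what the "easier" schema could look like (it assumes the id is a positive integer). Nothing here actually runs the XSD validation, which needs a schema-aware library.

```python
import re
import xml.etree.ElementTree as ET


def normalize_doc(xml_text):
    """Rename <DOC_OBL_N> children of the root to <DOC_OBL id="N">."""
    root = ET.fromstring(xml_text)
    for child in root:
        match = re.fullmatch(r"DOC_OBL_(\d+)", child.tag)
        if match:
            child.tag = "DOC_OBL"
            child.set("id", match.group(1))
    return ET.tostring(root, encoding="unicode")


# Hypothetical schema for the normalized document: one declaration with
# maxOccurs="unbounded" covers any number of <DOC_OBL> elements.
SIMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="DOC">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="DOC_OBL" type="DocType"
                    minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="DocType">
    <xs:sequence>
      <xs:element name="TIP_DOC_OBL" type="xs:string" minOccurs="0" />
    </xs:sequence>
    <xs:attribute name="id" type="xs:positiveInteger" use="required" />
  </xs:complexType>
</xs:schema>"""
```

The renaming keeps the sequence number as an attribute, so no information is lost before the document is handed to a validator.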

But the GIGO principle applies: Garbage In, Garbage Out ==> crappy XML structure comes in, only a crappy, incomplete validation is possible.

Marc




Answer 2:


It's unfortunate that the XML element names basically have sequence numbers/identifiers in them. I would say that's poorly defined (non-standard) XML.

In my limited (!) experience, this means that the XSD schema would have to have all the possible "DOC_OBL_N" elements defined in the sequence. This is probably not practical if there is no theoretical upper limit to their number.

As long as it's well-formed XML, you could load it up, count all the children of the element DOC and then write the schema on the fly, but that sounds like it's self-defeating.

That may leave you with manually validating the XML instance using some XPath expressions, which is kind of a brute-force approach and not technically validating against an XSD schema.
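
To make the "schema on the fly" idea concrete, here is a rough sketch in Python using only the standard xml.etree.ElementTree. The function name build_schema is made up, and the DocType part simply mirrors the single TIP_DOC_OBL child from the question, so a real document would need more there.

```python
import re
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def build_schema(xml_text):
    """Return XSD text with one declaration per DOC_OBL_N child found."""
    ET.register_namespace("xs", XS)
    doc = ET.fromstring(xml_text)

    schema = ET.Element(f"{{{XS}}}schema")
    doc_el = ET.SubElement(schema, f"{{{XS}}}element", name="DOC")
    doc_ct = ET.SubElement(doc_el, f"{{{XS}}}complexType")
    doc_seq = ET.SubElement(doc_ct, f"{{{XS}}}sequence")

    # One element declaration for every numbered child, in document order.
    for child in doc:
        if re.fullmatch(r"DOC_OBL_\d+", child.tag):
            ET.SubElement(doc_seq, f"{{{XS}}}element",
                          name=child.tag, type="DocType")

    # Shared type, assumed here to hold only an optional TIP_DOC_OBL.
    doc_type = ET.SubElement(schema, f"{{{XS}}}complexType", name="DocType")
    type_seq = ET.SubElement(doc_type, f"{{{XS}}}sequence")
    ET.SubElement(type_seq, f"{{{XS}}}element",
                  name="TIP_DOC_OBL", type="xs:string", minOccurs="0")

    return ET.tostring(schema, encoding="unicode")
```

Because the schema is derived from the very instance it is supposed to check, it can never reject a missing or extra DOC_OBL_N element. At best it validates the content inside each one.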
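
A minimal sketch of that kind of manual check, again in Python with the standard xml.etree.ElementTree (which only supports a small XPath subset, so the name test is done with a regular expression instead). The function name check_doc and the specific rules, numbering that starts at 1 with no gaps and at most one TIP_DOC_OBL per element, are my assumptions about what "valid" means here.

```python
import re
import xml.etree.ElementTree as ET


def check_doc(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    if root.tag != "DOC":
        return [f"unexpected root element <{root.tag}>"]

    problems = []
    for position, child in enumerate(root, start=1):
        match = re.fullmatch(r"DOC_OBL_(\d+)", child.tag)
        if match is None:
            problems.append(f"unexpected element <{child.tag}>")
            continue
        # Assumed rule: the N in DOC_OBL_N equals the element's position.
        if int(match.group(1)) != position:
            problems.append(f"<{child.tag}> found at position {position}")
        # Assumed rule: TIP_DOC_OBL may appear at most once.
        if len(child.findall("TIP_DOC_OBL")) > 1:
            problems.append(f"<{child.tag}> has more than one TIP_DOC_OBL")
    return problems
```

Calling check_doc on the example document from the question should give an empty list, while a document that skips from DOC_OBL_1 to DOC_OBL_3 gets reported.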



Source: https://stackoverflow.com/questions/1299609/xml-schema-different-element-names-sequence
