dependency graph traversal in XSLT for copying related elements of an XML model

Question

I want to demonstrate the power of XSL for data exploration by solving the following problem. Given an XML file that describes some kind of "entity-relationship" model, and one entity in that model identified by name (assuming an attribute defined in the XML schema is used as the identifier), I want a transformation that produces a new XML model containing the given entity plus all of its relatives, that is, the transitive closure of the dependency relationship starting from that entity.

For example, the input XML model is

<root>
    <!-- my model is made of 3 entities: leaf, composite and object -->
    <!-- the XML elements <leaves>, <composites> and <objects> are just placeholders for these entities -->
    <!-- These placeholders are expected to be in that order in the output as well as in the input (schema constraints) -->
    <leaves>
        <!-- A, B, C are 3 different types of leaf nodes, each with its own semantics in the model -->
        <A name="f1" others="oooo"/>
        <A name="f2" others="xxxx"/>
        <B name="f3" others="ssss"/>
        <C name="f4" others="gggg"/>    
    </leaves>
    <composites>
        <!-- composites contains only struct and union elements -->
        <struct name="structB" others="yyyy">
            <!-- composite pattern, struct can embed struct in a tree-ish fashion -->
            <sRef name="s6" nameRef="structA"/>
            <!-- order of declaration does not matter !!! here in the XML, structA is not yet declared but file is valid -->
            <uRef name="u7" nameRef="unionX"/>
        </struct>
        <!-- union is another kind of composition -->
        <union name="unionX" others="rrrr">
            <vRef name="u3" nameRef="f3" others="jjjj"/>
            <vRef name="u4" nameRef="f2" others="pppp"/>
        </union>
        <struct name="structA" others="hhhh">
            <vRef name="v1" nameRef="f1" others="jjjj"/>
            <vRef name="v2" nameRef="f4" others="pppp"/>
        </struct>
    </composites>
    <objects>
        <object name="objB" others="tttt">
            <field name="field1" nameRef="unionX" others="qqqq"/>
            <field name="field2" nameRef="f2" others="cccc"/>
        </object>
        <object name="objC" others="nnnn">
            <field name="fieldX" nameRef="structB" others="uuuu"/>
            <field name="fieldY" nameRef="" others="mmmm"/>
        </object>
        <object name="objMain" others="nnnn">
            <field name="fieldY" nameRef="structA" others="mmmm"/>
            <field name="fieldY" nameRef="f3" others="mmmm"/>
            <field name="object4" nameRef="objB" others="wwwww"/>
        </object>
    </objects>
</root>

I would like a transformation that, for a given name, creates a copy of the model containing only the information related to the element with that name and to its dependencies, as described by the nameRef attributes.

So for the element "f1" the output would be

<root>
    <leaves>
        <A name="f1" others="oooo"/>
    </leaves>
    <!-- composites and objects placeholders shall be copied even when no elements in the graph traversal -->
    <composites/>
    <objects/>
</root>

whereas for "objB" the expected output would be

<root>
    <leaves>
        <!-- element "f2" shall be copied only once in the output, although the node is encountered twice in the traversal of the "objB" tree:
            - "f2" is referenced under "field2" of "objB"
            - "f2" is referenced under "u4" of "unionX", which is referenced under "field1" of "objB"
        -->
        <A name="f2" others="xxxx"/>
        <B name="f3" others="ssss"/>
    </leaves>
    <composites>
        <union name="unionX" others="rrrr">
            <vRef name="u3" nameRef="f3" others="jjjj"/>
            <vRef name="u4" nameRef="f2" others="pppp"/>
        </union>
    </composites>
    <objects>
        <object name="objB" others="tttt">
            <field name="field1" nameRef="unionX" others="qqqq"/>
            <field name="field2" nameRef="f2" others="cccc"/>
        </object>
    </objects>
</root>

and so on and so forth.

So far I have worked out a basic XSL stylesheet, but it is not very satisfying, for the following reasons:

my transformation is not built on an identity-transformation base for copying
my transformation uses xsl:copy-of when it encounters a matching entity, but this breaks the design and violates the XSD schema
the output file is not compliant with the XML Schema definition of the input, mostly because xsl:copy-of bypasses the normal traversal of the XML elements
my transformation creates duplicate entities in the output when one appears several times in the transitive closure of the dependency relationship

I only have some "intuitions" about a good and elegant way to do it:

starting from an "identity transformation" template, to respect the XML Schema of the input
using grouping / sorting by key
implementing some kind of "Muenchian method" (I am not sure about this one; it may only be relevant for XSLT 1.0)

To simplify, you can make the following assumptions:

there are no cyclic dependencies (so a tree walk can be implemented)
nameRef / name are cross-checked by a "key" in the XSD, so references are correct in the input
the input parameter "name" of the element to search for exists in the input XML model (although it would be nice to produce an "empty" valid XML document when it does not)

The "empty" XML output model should be as follows (due to schema constraints):

<root>
    <leaves/>
    <composites/>
    <objects/>
</root>

To complete the picture: the XSLT processor I am currently using is Saxon, and the XSLT version is 2.0. Thanks for helping. I have not posted my own XSL because I am not proud of it, but I will if it would be helpful.

Answer 1:

I tried to implement "a transformation that, for a given name, creates a copy of the model with only information related to the element of this name, and of its dependencies described by the nameRef attributes" at https://xsltfiddle.liberty-development.net/gWEamLs/6:

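<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- name of the element whose dependencies are to be extracted -->
  <xsl:param name="start-name" as="xs:string">objB</xsl:param>

  <!-- look up any named element by the value of its name attribute -->
  <xsl:key name="name-ref" match="*[@name]" use="@name"/>

  <!-- returns the start element, its children, and (recursively) whatever the children reference via nameRef -->
  <xsl:function name="mf:traverse" as="element()*">
      <xsl:param name="start" as="element()?"/>
      <xsl:sequence select="$start, $start/*, $start/*[@nameRef]!key('name-ref', @nameRef, root(.))!mf:traverse(.)"/>
  </xsl:function>

  <xsl:param name="start-element" as="element()?" select="key('name-ref', $start-name)"/>

  <!-- all elements reached from the start element -->
  <xsl:variable name="named-elements" select="mf:traverse($start-element)"/>

  <!-- identity transformation for every node not matched by a template -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- drop named elements that the traversal did not reach -->
  <xsl:template match="*[@name and not(. intersect $named-elements)]"/>

</xsl:stylesheet>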

The code is based on a key and a recursive function. It first computes the related elements as a sequence of element nodes, held in a global variable. Then the identity transformation, set up declaratively by <xsl:mode on-no-match="shallow-copy"/>, is extended by a single empty template. That template matches every element that has a name attribute but was not found by the recursive function as being related to the start element, so unrelated elements are not copied to the output.
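
The question mentions XSLT 2.0, whereas the stylesheet above uses two XSLT 3.0 / XPath 3.0 features: xsl:mode and the "!" mapping operator. As a rough sketch only (not part of the original answer and not run against the sample), the same idea restricted to XSLT 2.0 might look like the following, with an explicit identity template in place of xsl:mode and a "for" expression in place of "!". It assumes, as the original does, that the start name identifies at most one element.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    version="2.0">

  <xsl:param name="start-name" as="xs:string">objB</xsl:param>

  <xsl:key name="name-ref" match="*[@name]" use="@name"/>

  <!-- same traversal as above, written with "for" instead of the XPath 3.0 "!" operator -->
  <xsl:function name="mf:traverse" as="element()*">
      <xsl:param name="start" as="element()?"/>
      <xsl:sequence select="$start, $start/*,
          for $ref in $start/*[@nameRef],
              $target in key('name-ref', $ref/@nameRef, root($ref))
          return mf:traverse($target)"/>
  </xsl:function>

  <xsl:param name="start-element" as="element()?" select="key('name-ref', $start-name)"/>

  <xsl:variable name="named-elements" select="mf:traverse($start-element)"/>

  <!-- explicit identity template, standing in for xsl:mode on-no-match="shallow-copy" -->
  <xsl:template match="@* | node()">
      <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
  </xsl:template>

  <!-- drop named elements that the traversal did not reach -->
  <xsl:template match="*[@name and not(. intersect $named-elements)]"/>

</xsl:stylesheet>

The empty template takes precedence over the identity template because a pattern with a predicate has a higher default priority than node(). Duplicates in the traversal result do not matter, since the sequence is only used for a membership test and each element is copied at most once, in document order, by the identity transformation.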

Source: https://stackoverflow.com/questions/59226114/dependency-graph-traversal-in-xslt-for-copying-related-elements-of-an-xml-model

Tags

xml

xslt

xslt-2.0

saxon

xslt-grouping