Annotating an XML instance from a list of XPath statements with XSLT

Submitted by 不羁岁月 on 2019-12-02 01:24:15

Overview:

Write a meta XSLT transformation that takes the paths file as input and produces a new XSLT transformation as output. This new XSLT will transform from your root input XML to the annotated copy output XML.

Notes:

  1. Works with XSLT 1.0, 2.0, or 3.0.
  2. Should be very efficient, especially if the generated transformation has to be run over a large input or has to be run repeatedly, because it effectively compiles into native XSLT rather than reimplementing matching with an XSLT-based interpreter.
  3. Is more robust than approaches that have to rebuild element ancestry manually in code. Since it maps the paths to template/@match attributes, the full sophistication of XSLT match patterns is available efficiently. I've included an attribute value test as an example.
  4. Be sure to consider elegant XSLT 2.0 and 3.0 solutions by @DanielHaley and @MartinHonnen, especially if an intermediate meta XSLT file won't work for you. By leveraging XSLT 3.0's XPath evaluation facilities, @MartinHonnen's answer appears to be able to provide even more robust matching than template/@match does here.

This input XML that specifies XPaths and annotations:

<paths>
  <xpath location="/root/a" annotate="1"/>
  <xpath location="/root/a/b" annotate="2"/>
  <xpath location="/root/c[@x='123']" annotate="3"/>
</paths>

When input to this meta XSLT transformation:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/paths">
    <xsl:element name="xsl:stylesheet">
      <xsl:attribute name="version">1.0</xsl:attribute>
      <xsl:element name="xsl:output">
        <xsl:attribute name="method">xml</xsl:attribute>
        <xsl:attribute name="indent">yes</xsl:attribute>
      </xsl:element>
      <xsl:call-template name="gen_identity_template"/>
      <xsl:apply-templates select="xpath"/>
    </xsl:element>
  </xsl:template>

  <xsl:template name="gen_identity_template">
    <xsl:element name="xsl:template">
      <xsl:attribute name="match">node()|@*</xsl:attribute>
      <xsl:element name="xsl:copy">
        <xsl:element name="xsl:apply-templates">
          <xsl:attribute name="select">node()|@*</xsl:attribute>
        </xsl:element>
      </xsl:element>
    </xsl:element>
  </xsl:template>

  <xsl:template match="xpath">
    <xsl:element name="xsl:template">
      <xsl:attribute name="match">
        <xsl:value-of select="@location"/>
      </xsl:attribute>
      <xsl:element name="xsl:comment">
        <xsl:value-of select="@annotate"/>
      </xsl:element>
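      <!-- disable-output-escaping writes the literal characters &#xa; into the
           generated stylesheet; it only takes effect when the result is serialized -->
      <xsl:element name="xsl:text">
        <xsl:text disable-output-escaping="yes">&amp;#xa;</xsl:text>
      </xsl:element>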
      <xsl:element name="xsl:copy">
        <xsl:element name="xsl:apply-templates">
          <xsl:attribute name="select">node()|@*</xsl:attribute>
        </xsl:element>
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

Will produce this XSLT transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:output method="xml" indent="yes"/>
   <xsl:template match="node()|@*">
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
   </xsl:template>
   <xsl:template match="/root/a">
      <xsl:comment>1</xsl:comment>
      <xsl:text>&#xa;</xsl:text>
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
   </xsl:template>
   <xsl:template match="/root/a/b">
      <xsl:comment>2</xsl:comment>
      <xsl:text>&#xa;</xsl:text>
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
   </xsl:template>
   <xsl:template match="/root/c[@x='123']">
      <xsl:comment>3</xsl:comment>
      <xsl:text>&#xa;</xsl:text>
      <xsl:copy>
         <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

Which, when provided this input XML file:

<root>
  <a>
    <b>B</b>
  </a>
  <c x="123">C</c>
</root>

Will produce the desired output XML file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <!--1-->
   <a>
    <!--2-->
      <b>B</b>
  </a>
  <!--3-->
   <c x="123">C</c>
</root>
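
The meta transformation above builds every generated instruction with xsl:element and xsl:attribute. As a sketch of an alternative, not a replacement, the same generator can be written with literal result elements and xsl:namespace-alias, which tends to read closer to the stylesheet it produces. The out prefix and the urn:example:generated-xslt namespace URI are placeholders I made up for illustration; I have not run this against every processor, and the prefix used in the generated file may vary between processors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:out="urn:example:generated-xslt">

  <!-- Elements written as out:* here are emitted in the XSLT namespace.
       The out prefix and its URI are illustrative placeholders. -->
  <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/paths">
    <out:stylesheet version="1.0">
      <out:output method="xml" indent="yes"/>
      <!-- generated identity template -->
      <out:template match="node()|@*">
        <out:copy>
          <out:apply-templates select="node()|@*"/>
        </out:copy>
      </out:template>
      <!-- one generated template per xpath entry -->
      <xsl:apply-templates select="xpath"/>
    </out:stylesheet>
  </xsl:template>

  <xsl:template match="xpath">
    <!-- match="{@location}" is an attribute value template: the path
         string is copied into the generated match attribute -->
    <out:template match="{@location}">
      <out:comment><xsl:value-of select="@annotate"/></out:comment>
      <!-- The inner xsl:text protects the line break from whitespace
           stripping in this stylesheet. The generated file then holds a
           literal line break inside xsl:text instead of &#xa;. -->
      <out:text><xsl:text>&#xa;</xsl:text></out:text>
      <out:copy>
        <out:apply-templates select="node()|@*"/>
      </out:copy>
    </out:template>
  </xsl:template>
</xsl:stylesheet>
```

Run against the same paths file, this should generate a stylesheet equivalent to the one shown earlier, without needing disable-output-escaping.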

Assuming Saxon 9 PE or EE, it should also be possible to make use of XSLT 3.0 and xsl:evaluate as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs math map mf"
    version="3.0">

    <xsl:output indent="yes"/>

    <xsl:param name="paths-url" as="xs:string" select="'paths1.xml'"/>
    <xsl:param name="paths-doc" as="document-node()" select="doc($paths-url)"/>

    <xsl:variable name="main-root" select="/"/>

    <xsl:variable name="mapped-nodes">
        <map>
            <xsl:for-each select="$paths-doc/paths/xpath">
                <xsl:variable name="node" as="node()?" select="mf:evaluate(@location, $main-root)"/>
                <xsl:if test="$node">
                    <entry key="{generate-id($node)}">
                        <xsl:value-of select="@annotate"/>
                    </entry>
                </xsl:if>
            </xsl:for-each>
        </map>
    </xsl:variable>

    <xsl:key name="node-by-id" match="map/entry" use="@key"/>

    <xsl:function name="mf:evaluate" as="node()?">
        <xsl:param name="path" as="xs:string"/>
        <xsl:param name="context" as="node()"/>
        <xsl:evaluate xpath="$path" context-item="$context"></xsl:evaluate>
    </xsl:function>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node()[key('node-by-id', generate-id(), $mapped-nodes)]">
        <xsl:comment select="key('node-by-id', generate-id(), $mapped-nodes)"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>


</xsl:stylesheet>

Here is an edited version of the originally posted code that uses the XSLT 3.0 map feature instead of a temporary document to store the association between the generated id of a node found by dynamic XPath evaluation and the annotation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs math map mf"
    version="3.0">

    <xsl:param name="paths-url" as="xs:string" select="'paths1.xml'"/>
    <xsl:param name="paths-doc" as="document-node()" select="doc($paths-url)"/>

    <xsl:output indent="yes"/>

    <xsl:variable 
        name="mapped-nodes"
        as="map(xs:string, xs:string)"
        select="map:new(for $path in $paths-doc/paths/xpath, $node in mf:evaluate($path/@location, /) return map:entry(generate-id($node), string($path/@annotate)))"/>

    <xsl:function name="mf:evaluate" as="node()?">
        <xsl:param name="path" as="xs:string"/>
        <xsl:param name="context" as="node()"/>
        <xsl:evaluate xpath="$path" context-item="$context"></xsl:evaluate>
    </xsl:function>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node()[map:contains($mapped-nodes, generate-id())]">
        <xsl:comment select="$mapped-nodes(generate-id())"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>


</xsl:stylesheet>

Like the first stylesheet, it needs Saxon 9.5 PE or EE to run.
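
One caveat on the map version: map:new comes from a working draft of the map proposal, and the final XSLT 3.0 / XPath 3.1 function library provides map:merge instead. Below is an untested sketch of how the stylesheet might look against the final specification. The 'duplicates': 'use-last' option and the change of mf:evaluate to return node()* (so that one path may select several nodes) are my own assumptions about the desired behaviour, not part of the original code.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs map mf"
    version="3.0">

    <xsl:param name="paths-url" as="xs:string" select="'paths1.xml'"/>
    <xsl:param name="paths-doc" as="document-node()" select="doc($paths-url)"/>

    <xsl:output indent="yes"/>

    <!-- generated id of each selected node -> annotation text.
         Assumption: if two paths select the same node, the later one wins. -->
    <xsl:variable name="mapped-nodes" as="map(xs:string, xs:string)"
        select="map:merge(
                    for $path in $paths-doc/paths/xpath,
                        $node in mf:evaluate($path/@location, /)
                    return map:entry(generate-id($node), string($path/@annotate)),
                    map { 'duplicates': 'use-last' })"/>

    <!-- Assumption: node()* instead of node()? so that a path selecting
         several nodes annotates all of them rather than raising a type error. -->
    <xsl:function name="mf:evaluate" as="node()*">
        <xsl:param name="path" as="xs:string"/>
        <xsl:param name="context" as="node()"/>
        <xsl:evaluate xpath="$path" context-item="$context"/>
    </xsl:function>

    <!-- identity template -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- nodes whose generated id is in the map get a comment in front -->
    <xsl:template match="node()[map:contains($mapped-nodes, generate-id())]">
        <xsl:comment select="$mapped-nodes(generate-id())"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:copy>
            <xsl:apply-templates select="@* , node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

As before, this relies on xsl:evaluate, so it needs a processor and edition that supports dynamic XPath evaluation.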

I'm not sure if kjhughes' suggestion of creating a second transform would be more efficient than your original idea or not. I do see the possibility of that second transform becoming huge if your paths XML gets large.

Here's how I'd do it...

XML Input

<root>
    <a>
        <b>B</b>
    </a>
    <c>C</c>
</root>

"paths" XML (paths.xml)

<paths>
    <xpath location="/root/a" annotate="1"/>
    <xpath location="/root/a/b" annotate="2"/>
</paths>

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="paths" select="document('paths.xml')"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*" priority="1">
        <xsl:variable name="path">
            <xsl:for-each select="ancestor-or-self::*">
                <xsl:value-of select="concat('/',local-name())"/>
            </xsl:for-each>
        </xsl:variable>
        <xsl:if test="$paths/*/xpath[@location=$path]">
            <xsl:comment select="$paths/*/xpath[@location=$path]/@annotate"/>
        </xsl:if>
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

XML Output

<root>
    <!--1-->
    <a>
        <!--2-->
        <b>B</b>
    </a>
    <c>C</c>
</root>
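
If the paths file grows, the repeated $paths/*/xpath[@location=$path] lookup in the XSLT 2.0 stylesheet above may become the slow part. A possible variation, sketched here without benchmarks, is to declare a key over the xpath elements and use the three-argument key() function from XSLT 2.0. The key name annotation-by-location and the variable name hit are names I made up for illustration. Like the original, it only handles simple element paths without predicates.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="paths" select="document('paths.xml')"/>

    <!-- index the xpath elements of the paths document by their location string -->
    <xsl:key name="annotation-by-location" match="xpath" use="@location"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*" priority="1">
        <!-- rebuild the element's path, e.g. /root/a/b -->
        <xsl:variable name="path"
            select="string-join(for $e in ancestor-or-self::*
                                return concat('/', local-name($e)), '')"/>
        <!-- the third argument makes key() search the paths document -->
        <xsl:variable name="hit" select="key('annotation-by-location', $path, $paths)"/>
        <xsl:if test="$hit">
            <!-- if several entries share a location, use the first one -->
            <xsl:comment select="$hit[1]/@annotate"/>
        </xsl:if>
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

For the sample input and paths.xml above, this should give the same output as the original version.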