Question
I have a node-set constructed using xsl:key in XSLT. I would like to find the lowest common ancestor (LCA) of all of the nodes in this node-set - any ideas?
I know about the Kaysian intersection idiom and XPath 2.0's intersect operator, but these seem to be geared towards finding the LCA of just a pair of elements, and I don't know in advance how many items will be in each node-set.
I was wondering if there might be a solution using a combination of the 'every' and 'intersect' expressions, but I haven't been able to think of one yet!
Thanks in advance, Tom
Answer 1:
Here is a bottom-up approach:
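<xsl:function name="my:lca" as="node()?">
  <xsl:param name="pSet" as="node()*"/>

  <xsl:sequence select=
   "if(not($pSet))
      then ()
      else
       if(not($pSet[2]))
         then $pSet[1]
         else
          (: some node in the set is an ancestor of another one:
             keep only the nodes that have no ancestor in the set :)
          if($pSet intersect $pSet/ancestor::node())
            then
              my:lca($pSet[not($pSet intersect ancestor::node())])
            else
              (: otherwise move every node up to its parent :)
              my:lca($pSet/..)
   "/>
</xsl:function>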
A test:
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:my">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:variable name="vSet1" select=
   "//*[self::A.1.1 or self::A.2.1]"/>

  <xsl:variable name="vSet2" select=
   "//*[self::B.2.2.1 or self::B.1]"/>

  <xsl:variable name="vSet3" select=
   "$vSet1 | //B.2.2.2"/>

  <xsl:template match="/">
    <!---->
    <xsl:sequence select="my:lca($vSet1)/name()"/>
    =========
    <xsl:sequence select="my:lca($vSet2)/name()"/>
    =========
    <xsl:sequence select="my:lca($vSet3)/name()"/>
  </xsl:template>

  <xsl:function name="my:lca" as="node()?">
    <xsl:param name="pSet" as="node()*"/>

    <xsl:sequence select=
     "if(not($pSet))
        then ()
        else
         if(not($pSet[2]))
           then $pSet[1]
           else
            if($pSet intersect $pSet/ancestor::node())
              then
                my:lca($pSet[not($pSet intersect ancestor::node())])
              else
                my:lca($pSet/..)
     "/>
  </xsl:function>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<t>
  <A>
    <A.1>
      <A.1.1/>
      <A.1.2/>
    </A.1>
    <A.2>
      <A.2.1/>
    </A.2>
    <A.3/>
  </A>
  <B>
    <B.1/>
    <B.2>
      <B.2.1/>
      <B.2.2>
        <B.2.2.1/>
        <B.2.2.2/>
      </B.2.2>
    </B.2>
  </B>
</t>
the wanted, correct result is produced for all three cases:
A
=========
B
=========
t
Update: I have what I think is probably the most efficient algorithm.
The idea is that the LCA of a node-set is the same as the LCA of just two of its nodes: the "leftmost" and the "rightmost" ones, i.e. the first and the last node in document order. The proof that this is correct is left as an exercise for the reader :)
Here is a complete XSLT 2.0 implementation:
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:my">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:variable name="vSet1" select=
   "//*[self::A.1.1 or self::A.2.1]"/>

  <xsl:variable name="vSet2" select=
   "//*[self::B.2.2.1 or self::B.1]"/>

  <xsl:variable name="vSet3" select=
   "$vSet1 | //B.2.2.2"/>

  <xsl:template match="/">
    <xsl:sequence select="my:lca($vSet1)/name()"/>
    =========
    <xsl:sequence select="my:lca($vSet2)/name()"/>
    =========
    <xsl:sequence select="my:lca($vSet3)/name()"/>
  </xsl:template>

  <xsl:function name="my:lca" as="node()?">
    <xsl:param name="pSet" as="node()*"/>

    <!-- $pSet is assumed to be in document order, so that $pSet[1]
         is the leftmost node and $pSet[last()] the rightmost one -->
    <xsl:sequence select=
     "if(not($pSet))
        then ()
        else
         if(not($pSet[2]))
           then $pSet[1]
           else
             for $n1 in $pSet[1],
                 $n2 in $pSet[last()]
               return my:lca2nodes($n1, $n2)
     "/>
  </xsl:function>

  <xsl:function name="my:lca2nodes" as="node()?">
    <xsl:param name="pN1" as="node()"/>
    <xsl:param name="pN2" as="node()"/>

    <!-- $n1: whichever of the two nodes has the fewest ancestors -->
    <xsl:variable name="n1" select=
     "($pN1 | $pN2)
        [count(ancestor-or-self::node())
        eq
         min(($pN1 | $pN2)/count(ancestor-or-self::node()))
        ]
        [1]"/>

    <xsl:variable name="n2" select="($pN1 | $pN2) except $n1"/>

    <!-- nearest ancestor-or-self of $n1 that is also
         an ancestor-or-self of $n2 -->
    <xsl:sequence select=
     "$n1/ancestor-or-self::node()
        [exists(. intersect $n2/ancestor-or-self::node())]
        [1]"/>
  </xsl:function>
</xsl:stylesheet>
When this transformation is performed on the same XML document (above), the same correct result is produced, but much faster, especially if the node-set is large:
A
=========
B
=========
t
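Since the question builds its node-set with xsl:key, here is a minimal sketch of how a key lookup could be fed into the leftmost/rightmost idea. In XSLT 2.0, key() returns its nodes in document order, which is what taking the first and last item relies on. The key name kByGroup, the group attribute and the value 'g1' are invented for illustration, and this compact my:lca is only one possible condensed variant, not the implementation above. The `$pSet/.` step is there only so that arbitrary input is sorted and de-duplicated first.

```xslt
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:my"
  exclude-result-prefixes="my">
  <xsl:output method="text"/>

  <!-- Hypothetical key: index elements by an assumed "group" attribute -->
  <xsl:key name="kByGroup" match="*[@group]" use="@group"/>

  <xsl:template match="/">
    <!-- Print the name of the LCA of all nodes returned by the key -->
    <xsl:value-of select="name(my:lca(key('kByGroup', 'g1')))"/>
  </xsl:template>

  <xsl:function name="my:lca" as="node()?">
    <xsl:param name="pSet" as="node()*"/>

    <!-- The "/." step sorts into document order and removes duplicates -->
    <xsl:variable name="vSorted" select="$pSet/."/>

    <!-- Nearest ancestor-or-self of the first node that is also
         an ancestor-or-self of the last node -->
    <xsl:sequence select=
     "$vSorted[1]/ancestor-or-self::node()
        [exists(. intersect $vSorted[last()]/ancestor-or-self::node())]
        [1]"/>
  </xsl:function>
</xsl:stylesheet>
```

Applied to a document in which several elements carry group="g1", this should print the name of their deepest shared ancestor. For a single match it prints that element's own name, and for no match it prints nothing.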
Answer 2:
I tried the following:
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf"
  version="2.0">

  <xsl:output method="html" indent="yes"/>

  <xsl:function name="mf:lca" as="node()?">
    <xsl:param name="nodes" as="node()*"/>
    <xsl:variable name="all-ancestors" select="$nodes/ancestor::node()"/>
    <xsl:sequence
      select="$all-ancestors[every $n in $nodes satisfies exists($n/ancestor::node() intersect .)][last()]"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:sequence select="mf:lca(//foo)"/>
  </xsl:template>

</xsl:stylesheet>
Tested with the sample
<root>
  <anc1>
    <anc2>
      <foo/>
      <bar>
        <foo/>
      </bar>
      <bar>
        <baz>
          <foo/>
        </baz>
      </bar>
    </anc2>
  </anc1>
</root>
I get the anc2 element, but I haven't tested with more complex settings and don't have the time now. Maybe you can try with your sample data and report back whether you get the results you want.
Answer 3:
Martin's solution will work, but I think it could be quite expensive in some situations, with a lot of elimination of duplicates. I'd be inclined to use an approach that finds the LCA of two nodes, and then use this recursively, on the theory that LCA(x,y,z) = LCA(LCA(x,y),z) [a theory which I leave the reader to prove...].
Now LCA(x,y) can be found fairly efficiently by looking at the sequences x/ancestor-or-self::node() and y/ancestor-or-self::node(), truncating both sequences to the length of the shorter, and then finding the last node that is in both: in XQuery notation:
( let $ax := $x/ancestor-or-self::node()
  let $ay := $y/ancestor-or-self::node()
  let $len := min((count($ax), count($ay)))
  for $i in reverse(1 to $len)
  where $ax[$i] is $ay[$i]
  return $ax[$i]
)[1]
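The recursive step LCA(x,y,z) = LCA(LCA(x,y),z) is only described in words above, so here is a rough XSLT 2.0 sketch of how the pairwise expression might be folded over a whole node-set. The function names my:lca-pair and my:lca-fold are made up for this example, and the template assumes the sample document from Answer 1 as input.

```xslt
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:my"
  exclude-result-prefixes="my">
  <xsl:output method="text"/>

  <!-- LCA of two nodes: compare the ancestor-or-self sequences
       position by position, starting from the deepest shared depth -->
  <xsl:function name="my:lca-pair" as="node()?">
    <xsl:param name="x" as="node()"/>
    <xsl:param name="y" as="node()"/>
    <xsl:variable name="ax" select="$x/ancestor-or-self::node()"/>
    <xsl:variable name="ay" select="$y/ancestor-or-self::node()"/>
    <xsl:variable name="len" select="min((count($ax), count($ay)))"/>
    <xsl:sequence select=
     "(for $i in reverse(1 to $len)
         return $ax[$i][. is $ay[$i]]
      )[1]"/>
  </xsl:function>

  <!-- LCA of a whole sequence: replace the first two nodes by their
       LCA and recurse on the shortened sequence -->
  <xsl:function name="my:lca-fold" as="node()?">
    <xsl:param name="nodes" as="node()*"/>
    <xsl:choose>
      <xsl:when test="count($nodes) le 1">
        <xsl:sequence select="$nodes"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="head"
          select="my:lca-pair($nodes[1], $nodes[2])"/>
        <!-- An empty $head means the two nodes share no ancestor
             (different trees), so there is no LCA at all -->
        <xsl:sequence select=
         "if (empty($head))
            then ()
            else my:lca-fold(($head, subsequence($nodes, 3)))"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select=
     "name(my:lca-fold(//*[self::A.1.1 or self::A.2.1 or self::B.2.2.2]))"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against the `<t>` document from Answer 1, this should print t: the first pair (A.1.1, A.2.1) reduces to A, and A together with B.2.2.2 reduces to t. Unlike the leftmost/rightmost version, the fold does not depend on the input being in document order, at the cost of one pairwise comparison per node.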
Source: https://stackoverflow.com/questions/8742002/finding-the-lowest-common-ancestor-of-an-xml-node-set