Question
I am trying to find the names of all elements that satisfy the following two rules.
1. The element has a <set>erase</set> child.
2. If two or more elements in the same hierarchy have <set>erase</set> (e.g. both <b> and <d>), only the topmost element's name should be printed (<b> in this case).
So the required result for the XML below is:
b, y, p
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<a>
<b>
<set>erase</set>
<d>
<set>erase</set>
</d>
</b>
<c>
<x>
</x>
</c>
<e>
<y>
<set>erase</set>
<q>
</q>
</y>
<z>
<p>
<set>erase</set>
</p>
</z>
</e>
</a>
With the query (//set[contains(.,'erase')])[1], the result set contains only node b.
With the query //set[contains(.,'erase')], the result set contains all of b, d, y and p.
Can anyone help me find a query that returns only b, y and p?
Here is the Java snippet I used:
XPath xpath = factory.newXPath();
String query = "//set[contains(.,'erase')]";
NodeList nodes;
try {
    // Compile and evaluate the query against the parsed document.
    XPathExpression expr = xpath.compile(query);
    nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
    // Fail here instead of continuing with a null expression or result.
    throw new IllegalStateException("Could not evaluate: " + query, e);
}
for (int i = 0; i < nodes.getLength(); i++) {
    // Start at the element that contains the matched <set>.
    Node n = nodes.item(i).getParentNode();
    String x = n.getNodeName();
    // Walk up until reaching the element named after the request class.
    while (!n.getNodeName().equalsIgnoreCase(request.getClass().getSimpleName())) {
        n = n.getParentNode();
        if (n == null) {
            break; // no ancestor carries that name; stop instead of throwing
        }
        x = n.getNodeName() + "." + x;
    }
    System.out.println("Path: " + x);
}
Output:
a.b
a.b.d
a.e.y
a.e.z.p
Could anyone help me figure out a query that yields only a.b, a.e.y and a.e.z.p?
Let me know if you need more details or another use case.
Answer 1:
One expression that selects exactly the wanted elements is:
//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:for-each select=
"//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]">
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This transformation, when applied to the XML document provided by Sean B. Durkin:
<a>
<b>
<set>erase</set>
<set>
<a/>erase
</set>
<d>
<set>erase</set>
</d>
</b>
<c>
<x> </x>
</c>
<e>
<y>
<set>erase</set>
<q> </q>
</y>
<z>
<p>
<set>erase</set>
</p>
</z>
</e>
</a>
evaluates the XPath expression above and outputs the names of the selected elements, producing the wanted result:
b
y
p
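For readers who want to run the check from Java, as in the question, rather than through XSLT, here is a minimal sketch. It assumes the JDK's built-in JAXP XPath 1.0 implementation; the class name TopmostErase and the helper selectNames are hypothetical names introduced only for this illustration, and the test document is the one above with whitespace removed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TopmostErase {
    // The expression from this answer, unchanged apart from whitespace.
    static final String QUERY =
        "//*[set[. = 'erase' and not(node()[2])]"
        + " and not(ancestor::*[set[. = 'erase' and not(node()[2])]])]";

    // Hypothetical helper: names of the selected elements, in the order returned.
    static List<String> selectNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(QUERY, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sean B. Durkin's test document, written on one line.
        String xml = "<a><b><set>erase</set><set><a/>erase</set>"
            + "<d><set>erase</set></d></b>"
            + "<c><x> </x></c>"
            + "<e><y><set>erase</set><q> </q></y>"
            + "<z><p><set>erase</set></p></z></e></a>";
        System.out.println(selectNames(xml)); // prints [b, y, p]
    }
}
```

In the question's code, the only change needed would be to replace the query string with this expression and to read the name of each returned node directly, since the expression selects the parent elements rather than the set elements.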
Do note that the following two expressions are quite incorrect:
*[set[text()='erase']][not(ancestor::*[set[text()='erase']])]
Or:
*[set[text()='erase']][ancestor::*[set[text()!='erase']]]
These two expressions suffer from more than one problem:
1. They are relative expressions, so whatever initial context they are applied to, they cannot select all wanted elements in a hierarchy of undefined depth and structure.
2. set[text()='erase'] is true whenever any text-node child of set equals 'erase', so it selects not only an element of the form:
...
<set>erase</set>
but also elements of the form:
<set>
xyz
<a/>erase</set>
3. Similarly:
set[text()!='erase']
selects elements of the form:
<set>
xyz
<a/>erase</set>
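The difference between these predicates can be checked directly. The sketch below assumes the JDK's built-in JAXP XPath 1.0 implementation; the class SetPredicateDemo and its method matches are hypothetical names used only for illustration. It tests each predicate against a plain set element and against the mixed-content form shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SetPredicateDemo {
    // True when the root <set> element of setXml satisfies the given XPath predicate.
    static boolean matches(String setXml, String predicate) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(setXml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(
            "boolean(/set[" + predicate + "])", doc, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        String plain = "<set>erase</set>";
        String mixed = "<set>\nxyz\n<a/>erase</set>";
        String strict = ". = 'erase' and not(node()[2])";

        // Some text-node child of the mixed element equals 'erase'.
        System.out.println(matches(mixed, "text()='erase'"));  // true
        // Some other text-node child of the mixed element differs from 'erase'.
        System.out.println(matches(mixed, "text()!='erase'")); // true
        // The stricter predicate accepts only the plain form.
        System.out.println(matches(plain, strict));            // true
        System.out.println(matches(mixed, strict));            // false
    }
}
```

The comparison text()='erase' is existential over a node-set, which is why both it and text()!='erase' can hold for the same element.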
Answer 2:
This is my second attempt:
//*[ set[count(node())=1 and text()='erase'] and
not( ancestor::*[ set[count(node())=1 and text()='erase']])
]
This selection passes the test case shown in my first answer.
Answer 3:
The following XPath selects the nodes that you want:
//*[set[text()='erase']][not(ancestor::*[set[text()='erase']])]
I tested it with the following stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:template match="@*|text()" />
<xsl:template match="//*[set[text()='erase']][not(ancestor::*[set[text()='erase']])]">
<xsl:text>(</xsl:text>
<xsl:for-each select="self::*|ancestor::*">
<xsl:value-of select="name()"/>
<xsl:text>.</xsl:text>
</xsl:for-each>
<xsl:text>) </xsl:text>
</xsl:template>
</xsl:stylesheet>
It produced the output:
(a.b.) (a.e.y.) (a.e.z.p.)
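Since the question builds its dotted paths in Java, here is a rough Java counterpart to the stylesheet, assuming the JDK's built-in JAXP XPath implementation. The class ErasePaths and the helpers dottedPath and erasePaths are hypothetical names for this sketch. Unlike the stylesheet, it emits no trailing dot, matching the a.b format used in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ErasePaths {
    // The XPath from this answer.
    static final String QUERY =
        "//*[set[text()='erase']][not(ancestor::*[set[text()='erase']])]";

    // Joins the names of the node and its element ancestors, root first, with dots.
    static String dottedPath(Node node) {
        StringBuilder path = new StringBuilder(node.getNodeName());
        for (Node p = node.getParentNode();
             p != null && p.getNodeType() == Node.ELEMENT_NODE;
             p = p.getParentNode()) {
            path.insert(0, p.getNodeName() + ".");
        }
        return path.toString();
    }

    // Evaluates QUERY against the XML text and returns one dotted path per match.
    static List<String> erasePaths(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(QUERY, doc, XPathConstants.NODESET);
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            paths.add(dottedPath(nodes.item(i)));
        }
        return paths;
    }

    public static void main(String[] args) throws Exception {
        // The question's document, written on one line.
        String xml = "<a><b><set>erase</set><d><set>erase</set></d></b>"
            + "<c><x/></c>"
            + "<e><y><set>erase</set><q/></y>"
            + "<z><p><set>erase</set></p></z></e></a>";
        System.out.println(erasePaths(xml)); // prints [a.b, a.e.y, a.e.z.p]
    }
}
```

Walking up only while the parent is an element stops at the document node, so the loop needs no knowledge of the root element's name.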
Answer 4:
Or this slight tweak on Harpo's answer?
*[set[text()='erase']][ancestor::*[set[text()!='erase']]]
Following my comment on Novatchev's answer, please consider this useful test case.
It is a change from the questioner's demo document: I have added another node.
<?xml version="1.0"?>
<a>
<b>
<set>erase</set>
<set><a/>erase</set>
<d>
<set>erase</set>
</d>
</b>
<c>
<x>
</x>
</c>
<e>
<y>
<set>erase</set>
<q>
</q>
</y>
<z>
<p>
<set>erase</set>
</p>
</z>
</e>
</a>
The answer should be:
b
y
p
Source: https://stackoverflow.com/questions/9271001/xpath-query-to-get-the-ancester-nodes-based-on-element-value