XPath deepest node whose string content is longer than a given length

Submitted by 别来无恙 on 2019-12-11 06:27:00

Question


How does one use XPath to find the deepest node that matches a string content length constraint?

Given a chunk of XHTML (or XML) that looks like this:

<html>
    <body>
        <div id="page">
             <div id="desc">
                  This wool sweater has the following features:
                  <ul>
                       <li>4 buttons</li>
                       <li>Merino Wool</li>
                  </ul>
             </div>
        </div>
        ...
     </body>
</html>

An XPath expression like

//*[string-length() > 50]

would match <html>, <body>, <div id="page"> and <div id="desc">, because the string value of an element includes the text of all its descendants. How can one make XPath pick only the deepest matching node (i.e. <div id="desc">)?

Bonus points: how does one apply the constraint to the space-normalized content length?


Answer 1:


This cannot be expressed as a single XPath 1.0 expression that does not use variables.

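A single XPath 2.0 expression (max() is needed because a general comparison such as >= against a sequence is true as soon as any one item of the sequence satisfies it):

//*[string-length(.) > 50]
      [count(ancestor::*) >= max(//*[string-length(.) > 50]/count(ancestor::*))]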

An XPath 1.0 expression using a variable:

//*[string-length() > 50]
         [not(//*[string-length() > 50 
        and count(ancestor::*) > $vNumAncestrors])
         ]

where the variable vNumAncestrors holds the value of count(ancestor::*) for the context node.

The latter expression can be evaluated from a host environment that binds the variable, such as XSLT 1.0 or a DOM-based XPath API.

Here is one XSLT 1.0 implementation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/*">
  <xsl:variable name="vLongTextElements"
   select="//*[string-length()>50]"/>

  <xsl:for-each select="$vLongTextElements">
   <xsl:variable name="vNumAncestrors"
        select="count(ancestor::*)"/>

    <xsl:copy-of select=
    "(.)[not(//*[string-length() > 50
            and count(ancestor::*) > $vNumAncestrors])
         ]
    "/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<html>
    <body>
        <div id="page">
            <div id="desc">                                This wool sweater has the following features:                                
                <ul>
                    <li>4 buttons</li>
                    <li>Merino Wool</li>
                </ul>
            </div>
        </div>                      ...                   
    </body>
</html>

the wanted, correct result is produced:

<div id="desc">                                This wool sweater has the following features:                                
                <ul>

      <li>4 buttons</li>

      <li>Merino Wool</li>

   </ul>

</div>

As for the bonus question (applying the constraint to the space-normalized content length), this is simple to implement on top of the last solution by wrapping the string value in normalize-space():

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/*">
  <xsl:variable name="vLongTextElements"
   select="//*[string-length(normalize-space())>50]"/>

  <xsl:for-each select="$vLongTextElements">
   <xsl:variable name="vNumAncestrors"
        select="count(ancestor::*)"/>

    <xsl:copy-of select=
    "(.)[not(//*[string-length(normalize-space()) > 50
            and count(ancestor::*) > $vNumAncestrors])
         ]
    "/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

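The XPath 2.0 expression, modified in the same way (max() is needed because a general comparison such as >= against a sequence is true as soon as any one item of the sequence satisfies it):

//*[string-length(normalize-space(.)) > 50]
      [count(ancestor::*)
     >=
      max(//*[string-length(normalize-space(.)) > 50]/count(ancestor::*))
      ]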
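
For readers who are not working in XSLT, the same "keep a candidate only if no other candidate has more ancestors" filter can be sketched in a general-purpose host language. The sketch below is an illustration under stated assumptions, not part of the original answer: it uses Python's standard xml.etree.ElementTree and plain tree traversal instead of an XPath engine (ElementTree's XPath subset has no string-length()), and the helper names normalized_length, elements_with_depth and deepest_long_elements are made up for this example. Whitespace collapsing via split() only approximates normalize-space().

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
XHTML = """<html>
    <body>
        <div id="page">
             <div id="desc">
                  This wool sweater has the following features:
                  <ul>
                       <li>4 buttons</li>
                       <li>Merino Wool</li>
                  </ul>
             </div>
        </div>
        ...
     </body>
</html>"""


def normalized_length(element):
    # Approximates string-length(normalize-space(.)): join the text of the
    # element and all its descendants, then collapse whitespace runs.
    return len(" ".join("".join(element.itertext()).split()))


def elements_with_depth(element, depth=0):
    # Yield (element, number of ancestor elements), i.e. count(ancestor::*).
    yield element, depth
    for child in element:
        yield from elements_with_depth(child, depth + 1)


def deepest_long_elements(root, min_length=50):
    candidates = [(el, depth) for el, depth in elements_with_depth(root)
                  if normalized_length(el) > min_length]
    # Keep a candidate only if no other candidate has more ancestors.
    return [el for el, depth in candidates
            if not any(other > depth for _, other in candidates)]


root = ET.fromstring(XHTML)
deepest = deepest_long_elements(root)
print([(el.tag, el.get("id")) for el in deepest])  # [('div', 'desc')]
```

Like the XSLT above, this returns every candidate at the maximum depth, so several elements may come back if they tie.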



Answer 2:


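As Dimitre has pointed out, the difficulty with solving this in XPath 1.0 is that the usual maximum expression works only for values that are stored in the document (a child node or an attribute), not for computed values such as count(ancestor::*):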

$node-set[not($node-set/node-or-attribute > node-or-attribute)]

That's why in XSLT 1.0 you would use the "standard" maximum construction, which sorts in descending order and takes the first item:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <xsl:for-each select="//*[string-length(normalize-space())>50]">
            <xsl:sort select="count(ancestor::*)" 
                      data-type="number" order="descending"/>
            <xsl:if test="position()=1">
                <xsl:copy-of select="."/>
            </xsl:if>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Output:

<div id="desc">                   This wool sweater has the following features:                   
                <ul>
<li>4 buttons</li>
<li>Merino Wool</li>
</ul>
</div>
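
The sort-and-take-first idea is not specific to XSLT. As a rough illustration only (assuming Python's standard xml.etree.ElementTree rather than an XSLT processor, with the hypothetical helper names walk and deepest_long_element), the same construction might look like this:

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
XHTML = """<html>
    <body>
        <div id="page">
             <div id="desc">
                  This wool sweater has the following features:
                  <ul>
                       <li>4 buttons</li>
                       <li>Merino Wool</li>
                  </ul>
             </div>
        </div>
        ...
     </body>
</html>"""


def walk(element, depth=0):
    # Yield (element, number of ancestor elements) in document order.
    yield element, depth
    for child in element:
        yield from walk(child, depth + 1)


def deepest_long_element(root, min_length=50):
    # Candidates: elements whose whitespace-normalized text is long enough.
    long_enough = [
        (el, depth) for el, depth in walk(root)
        if len(" ".join("".join(el.itertext()).split())) > min_length
    ]
    if not long_enough:
        return None
    # Sort by ancestor count, descending, and take the first item.
    ranked = sorted(long_enough, key=lambda pair: pair[1], reverse=True)
    return ranked[0][0]


root = ET.fromstring(XHTML)
winner = deepest_long_element(root)
print(winner.tag, winner.get("id"))  # div desc
```

Unlike the filter-based approach, this returns a single element. Because Python's sort is stable, a tie at the maximum depth resolves to the first such element in document order.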


Source: https://stackoverflow.com/questions/4493323/xpath-deepest-node-whose-string-content-is-longer-than-a-given-length
