XPathEvalError: Unregistered function for matches() in lxml

佐手、 submitted on 2019-11-29 15:44:08

Question


I am trying to use the following XPath query in Python:

from lxml.html.soupparser import fromstring
root = fromstring(inString)
nodes = root.xpath(".//p3[matches(.,'ABC')]//preceding::p2//p3")

But it raises the following error:

  nodes = root.xpath(".//p3[matches(.,'ABC')]//preceding::p2//p3")
  File "lxml.etree.pyx", line 1507, in lxml.etree._Element.xpath (src\lxml\lxml.etree.c:52198)
  File "xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__ (src\lxml\lxml.etree.c:152124)
  File "xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result (src\lxml\lxml.etree.c:151097)
  File "xpath.pxi", line 212, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src\lxml\lxml.etree.c:150896)
  lxml.etree.XPathEvalError: Unregistered function

How can I use XPath 2.0 functions here with lxml?

Clarification

I was using the contains() function earlier, like this:

nodes = root.xpath(".//p3[contains(text(),'ABC')]//preceding::p2//p3")

The problem is that my XML has newlines and whitespace in the text, so I tried something like:

nodes = root.xpath(".//p3[contains(normalize-space(),'ABC')]//preceding::p2//p3")

But this has no effect. Finally I tried the matches() function and got the error above.

Sample XML

<doc>

<q></q>

<p1>
    <p2 dd="ert" ji="pp">

        <p3>1</p3>
        <p3>2</p3>
        <p3>
               ABC
        </p3>
        <p3>3</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>4</p3>
        <p3>5</p3>
        <p3>ABC</p3>
        <p3>6</p3>

     </p2>

</p1>
<r></r>
<p1>
    <p2 dd="ert" ji="pp">

        <p3>7</p3>
        <p3>8</p3>
        <p3>ABC
        </p3>
        <p3>9</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>10</p3>
        <p3>11</p3>
        <p3>ABC</p3>
        <p3>12</p3>

     </p2>

</p1>
</doc>

Answer 1:


As mentioned in the other answer, and stressing the other part of the quoted documentation, you can use the EXSLT extensions to get a regex match() function in lxml, for example:

......
ns = {"re": "http://exslt.org/regular-expressions"}
nodes = root.xpath(".//p3[re:match(.,'ABC')]//preceding::p2//p3", namespaces=ns)
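
For reference, here is a self-contained sketch of the EXSLT approach. It assumes the document is parsed with lxml.etree rather than soupparser, and it uses a trimmed-down version of the sample XML from the question. It calls re:test(), the EXSLT regular-expression function that returns a boolean, and the anchored pattern is only an illustration of tolerating surrounding whitespace:

```python
from lxml import etree

# Trimmed-down version of the sample XML from the question (assumption:
# parsed with lxml.etree instead of lxml.html.soupparser).
XML = """
<doc>
  <p1>
    <p2 dd="ert" ji="pp">
      <p3>1</p3>
      <p3>
           ABC
      </p3>
    </p2>
    <p2 dd="ert" ji="pp">
      <p3>4</p3>
      <p3>XYZ</p3>
    </p2>
  </p1>
</doc>
"""

root = etree.fromstring(XML)

# Prefix-to-URI mapping for the EXSLT regular-expressions namespace.
ns = {"re": "http://exslt.org/regular-expressions"}

# re:test() is true when the regex matches the string value of the node.
# The pattern allows leading and trailing whitespace around ABC.
hits = root.xpath(r".//p3[re:test(., '^\s*ABC\s*$')]", namespaces=ns)

matched_text = [p3.text.strip() for p3 in hits]
print(matched_text)
```

Only the p3 whose text is ABC surrounded by newlines and spaces is selected here; the same predicate can then be combined with the preceding:: axis as in the original query.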



Answer 2:


How can I use XPath 2.0 functions here with lxml?

You cannot (reference):

lxml supports XPath 1.0, XSLT 1.0 and the EXSLT extensions through libxml2 and libxslt in a standards compliant way.

contains() is probably the closest you can get in this case:

.//p3[contains(., 'ABC')]//preceding::p2//p3
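
A small sketch of how this behaves under XPath 1.0, assuming lxml.etree as the parser and a reduced sample document (the ABCDEF element is made up for illustration). It compares the substring test contains(., 'ABC') with an exact comparison on normalize-space(.), which may be closer to what the question needs when the text is padded with newlines and spaces:

```python
from lxml import etree

# Reduced sample; the ABCDEF element is a hypothetical addition that shows
# the difference between a substring test and an exact match.
XML = """
<doc>
  <p2>
    <p3>1</p3>
    <p3>
        ABC
    </p3>
  </p2>
  <p2>
    <p3>ABCDEF</p3>
    <p3>ABC</p3>
  </p2>
</doc>
"""

root = etree.fromstring(XML)

# Substring test on the string value of each p3: surrounding whitespace
# does not matter, but longer strings containing ABC also match.
substring_hits = root.xpath(".//p3[contains(., 'ABC')]")

# Exact test after collapsing and trimming whitespace.
exact_hits = root.xpath(".//p3[normalize-space(.) = 'ABC']")

substring_texts = [p3.text.strip() for p3 in substring_hits]
exact_texts = [p3.text.strip() for p3 in exact_hits]
print(substring_texts)
print(exact_texts)
```

Both expressions are plain XPath 1.0, so they work in lxml without any extension namespace.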


Source: https://stackoverflow.com/questions/34047567/xpathevalerror-unregistered-function-for-matches-in-lxml
