lxml.html parsing with XPath and variables

被撕碎了的回忆 2021-01-02 17:16

I have this HTML snippet

Table of Contents

2 Answers
  •  北海茫月
    2021-01-02 17:43

    Your first example works, but probably not the way you think it should:

    test=html.xpath("//ul[@class='toc']/li[@class='level2']/div[@class='li']/a/text()='One'")
    

    This returns a boolean, which is true if the condition ...='One' holds for any of the nodes in the result set on the left side of the XPath expression. That is why you get the error in your second example: the result is True, and True[0] is not valid because a boolean cannot be indexed.
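
    A minimal sketch of that behaviour follows. The markup is a hypothetical stand-in, since the question's original snippet is not reproduced here; the names SNIPPET, doc and base are likewise only for illustration.

```python
from lxml import html as lxml_html

# Hypothetical markup, assumed to resemble the question's table of contents.
SNIPPET = """
<ul class="toc">
  <li class="level2"><div class="li"><a href="#link1">One</a></div></li>
  <li class="level2"><div class="li"><a href="#link2">Two</a></div></li>
</ul>
"""

doc = lxml_html.fromstring(SNIPPET)
base = "//ul[@class='toc']/li[@class='level2']/div[@class='li']/a"

# The comparison is the outermost operation, so the whole expression
# evaluates to a single boolean rather than a list of nodes.
as_bool = doc.xpath(base + "/text()='One'")
no_match = doc.xpath(base + "/text()='Three'")

print(type(as_bool).__name__, as_bool)  # a plain Python bool
print(no_match)

# Indexing the boolean reproduces the kind of error described above.
try:
    as_bool[0]
except TypeError as exc:
    print("TypeError:", exc)
```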

    You probably want all nodes matching the expression that have 'One' as their text. The corresponding expression would be:

    test=html.xpath("//ul[@class='toc']/li[@class='level2']/div[@class='li']/a[text()='One']")
    

    This returns a node-set (in lxml, a list of elements). If you only need the URL as a string, select the href attribute instead:

    test=html.xpath("//ul[@class='toc']/li[@class='level2']/div[@class='li']/a[text()='One']/@href")
    # returns: ['#link1']
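
    Since the title asks about variables: lxml lets you pass XPath variables as keyword arguments to xpath() and reference them as $name inside the expression, which avoids building the query by string formatting. The sketch below assumes the same hypothetical markup as a stand-in for the question's snippet; the variable name label is arbitrary.

```python
from lxml import html as lxml_html

# Hypothetical markup, assumed to resemble the question's table of contents.
SNIPPET = """
<ul class="toc">
  <li class="level2"><div class="li"><a href="#link1">One</a></div></li>
  <li class="level2"><div class="li"><a href="#link2">Two</a></div></li>
</ul>
"""

doc = lxml_html.fromstring(SNIPPET)

# The predicate [text()=$label] filters the <a> elements; $label is bound
# from the keyword argument of the same name.
query = "//ul[@class='toc']/li[@class='level2']/div[@class='li']/a[text()=$label]"

links = doc.xpath(query, label="One")            # list of matching elements
hrefs = doc.xpath(query + "/@href", label="One")  # list of attribute strings
missing = doc.xpath(query, label="Three")         # empty list when nothing matches

print([a.get("href") for a in links])
print(hrefs)
print(missing)
```

    Because a predicate query always returns a list, check that it is non-empty before indexing it with [0].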
    
