XPath for selecting a section of an article

醉话见心 2021-01-17 02:59

Suppose a section of an article looks like this (the HTML source):

Introduction

....

References

...a bunch of tex
2 answers
  •  日久生厌
    2021-01-17 03:15

    If you want the elements up to the next h2, use an XPath like this:

    //*[following-sibling::h2[preceding-sibling::h2[1][contains(.,'References')]]  and preceding-sibling::h2[contains(.,'References')]]
    

    What it means: it selects every element that has

    -- a following-sibling h2 whose nearest preceding h2 contains 'References' (that is, the next h2 after 'References')

    -- a preceding-sibling h2 containing 'References'

    The first rule matches every sibling from the beginning up to the next h2 tag. The second matches every sibling after the required h2 tag to the end. The intersection of the two gives the needed elements.
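The intersection described above can be sketched in plain Python. This is only an illustration under assumptions: the sample markup is hypothetical (the question's HTML is not shown), the helper names `text_of` and `section_by_intersection` are made up for this sketch, and it inspects the children of a single parent. The standard library's `xml.etree.ElementTree` supports only a subset of XPath without the sibling axes, so the two rules are written out by hand; with a full XPath 1.0 engine such as lxml the expression could be evaluated directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the HTML from the question is not available.
SAMPLE = """
<div>
  <h2>Introduction</h2>
  <p>intro text</p>
  <h2>References</h2>
  <p>ref 1</p>
  <ul><li>ref 2</li></ul>
  <h2>Appendix</h2>
  <p>appendix text</p>
</div>
"""

def text_of(el):
    # Rough equivalent of the XPath string value of '.': all descendant text.
    return "".join(el.itertext())

def section_by_intersection(parent, title):
    kids = list(parent)
    selected = []
    for i, el in enumerate(kids):
        # Rule 2: some preceding-sibling h2 contains the title.
        rule2 = any(k.tag == "h2" and title in text_of(k) for k in kids[:i])
        # Rule 1: some following-sibling h2 whose nearest preceding h2
        # contains the title.
        rule1 = False
        for j in range(i + 1, len(kids)):
            if kids[j].tag != "h2":
                continue
            earlier_h2 = [k for k in kids[:j] if k.tag == "h2"]
            if earlier_h2 and title in text_of(earlier_h2[-1]):
                rule1 = True
                break
        if rule1 and rule2:
            selected.append(el)
    return selected

root = ET.fromstring(SAMPLE)
print([el.tag for el in section_by_intersection(root, "References")])
# prints ['p', 'ul']
```

Rule 1 needs an h2 after 'References', so if 'References' is the last heading this form selects nothing.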

    Alternatively, the XPath can be built on your suggestion:

    //h2[.='References']/following-sibling::*[preceding-sibling::h2[1][contains(.,'References')] and not(name()='h2')]
    

    This takes every sibling after the required h2 tag (//h2[.='References']/following-sibling::*) that is not itself an h2 and whose nearest preceding h2 is our 'References' h2.
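The same filter can be sketched as a forward walk over the siblings. Again this is a hedged illustration, not the answerer's code: the sample markup and the name `following_section` are assumptions, and the loop is written by hand because `xml.etree.ElementTree` does not support the `following-sibling` and `preceding-sibling` axes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup in which 'References' is the last heading.
SAMPLE = """
<div>
  <h2>Introduction</h2>
  <p>intro text</p>
  <h2>References</h2>
  <p>ref 1</p>
  <p>ref 2</p>
</div>
"""

def following_section(parent, title):
    kids = list(parent)
    selected = []
    for i, start in enumerate(kids):
        # //h2[.='References']: an h2 whose full text equals the title.
        if start.tag != "h2" or "".join(start.itertext()) != title:
            continue
        nearest_h2 = start
        for el in kids[i + 1:]:  # following-sibling::*
            if el.tag == "h2":
                nearest_h2 = el  # becomes the nearest preceding h2
                continue         # not(name()='h2')
            # preceding-sibling::h2[1][contains(., title)]
            if title in "".join(nearest_h2.itertext()) and el not in selected:
                selected.append(el)
    return selected

root = ET.fromstring(SAMPLE)
print([el.text for el in following_section(root, "References")])
# prints ['ref 1', 'ref 2']
```

Unlike the intersection form, this one does not require a later h2, so it still selects the section when the target heading is the last one.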
