XPath: return all non-blank text nodes that are not descendants of `a`, `style` or `script`

梦毁少年i 2020-12-06 10:07

What expression would select all text nodes which are:

  • not blank
  • not inside a, or script o
3 answers
  • 2020-12-06 10:41

    This should work, assuming "not inside" means the text node must not be a descendant of an "a", "script" or "style" element. If "not inside" only means it must not be a direct child, use parent::a and so on instead of ancestor::a.

    //text()[normalize-space() and not(ancestor::a | ancestor::script | ancestor::style)]
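To see the difference between ancestor:: and parent:: in practice, here is a minimal sketch in Python. It assumes the third-party lxml package is installed, and the sample markup is made up for illustration; neither comes from the original question.

```python
from lxml import etree  # third-party package (assumption): pip install lxml

# Hypothetical, well-formed sample markup invented for this illustration.
SAMPLE = """<html><head><style>p { color: red; }</style></head><body>
<p>Intro <a href="#">link <b>bold link</b></a> tail</p>
<script>var x = 1;</script>
<div>   </div>
</body></html>"""

root = etree.fromstring(SAMPLE)

# The expression from this answer: reject text with an a/script/style ancestor.
ANCESTOR_XPATH = (
    "//text()[normalize-space() and "
    "not(ancestor::a | ancestor::script | ancestor::style)]"
)
# The variant mentioned above: only reject direct children of a/script/style.
PARENT_XPATH = (
    "//text()[normalize-space() and "
    "not(parent::a | parent::script | parent::style)]"
)

# Strip surrounding whitespace only to make the results easier to read.
by_ancestor = [t.strip() for t in root.xpath(ANCESTOR_XPATH)]
by_parent = [t.strip() for t in root.xpath(PARENT_XPATH)]

print(by_ancestor)
print(by_parent)
```

With this sample, the ancestor:: version keeps only "Intro" and "tail", while the parent:: version also keeps "bold link", because that text sits in a b element nested inside the a rather than directly in it.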
    
  • 2020-12-06 10:41

    Use:

    //*[not(self::a or self::script or self::style)]/text()[normalize-space()]
    

    Not only is this expression shorter than the one in the currently accepted answer, but it may also be much more efficient.

    Do note that the expression doesn't use any backward (upward) axes at all.
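One thing worth checking before relying on this expression is what it does with nested markup. The sketch below is only an illustration: it assumes the third-party lxml package, and the sample document is invented.

```python
from lxml import etree  # third-party package (assumption): pip install lxml

# Hypothetical sample: a <b> nested inside an <a>.
SAMPLE = """<html><body>
<p>Intro <a href="#">link <b>bold link</b></a> tail</p>
<script>var x = 1;</script>
</body></html>"""

root = etree.fromstring(SAMPLE)

# The self:: test is applied to the element that directly contains the text,
# so it behaves like a check on the parent, not on all ancestors.
SELF_XPATH = (
    "//*[not(self::a or self::script or self::style)]"
    "/text()[normalize-space()]"
)

texts = [t.strip() for t in root.xpath(SELF_XPATH)]
print(texts)

# True when text nested deeper inside the <a> still gets through.
nested_text_selected = "bold link" in texts
print(nested_text_selected)
```

Here "link" (a direct child of the a) is filtered out, but "bold link" is still selected because its parent is the b element. So this form fits the "not a child of" reading; for "not a descendant of", an ancestor-based test seems to be needed.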

  • 2020-12-06 10:46

    I used Dimitre Novatchev's answer, but then I ran into the problem described by the original poster:

    not descendant of a, style or script

    Dimitre's answer excludes text directly inside a style tag but still includes text in elements nested within it. This version also excludes style, script and noscript tags together with all of their descendants:

    //div[@id='???']//*[not(ancestor-or-self::script or ancestor-or-self::noscript or ancestor-or-self::style)]/text()
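A quick way to try this out is sketched below in Python. It assumes the third-party lxml package; the id value "main" and the sample markup are hypothetical stand-ins (the '???' above is left for you to fill in).

```python
from lxml import etree  # third-party package (assumption): pip install lxml

# Hypothetical sample; "main" is a made-up id standing in for '???'.
SAMPLE = """<html><body>
<div id="main">
<p>Keep me</p>
<noscript><p>Enable JavaScript</p></noscript>
<script>var x = 1;</script>
</div>
<div id="other"><p>Outside</p></div>
</body></html>"""

root = etree.fromstring(SAMPLE)

SCOPED_XPATH = (
    "//div[@id='main']//*[not(ancestor-or-self::script"
    " or ancestor-or-self::noscript"
    " or ancestor-or-self::style)]/text()"
)

# The expression has no normalize-space() predicate, so blank text nodes
# are dropped here in Python instead.
scoped = [t.strip() for t in root.xpath(SCOPED_XPATH) if t.strip()]
print(scoped)
```

With this sample only "Keep me" is left: the p nested inside noscript is rejected by the ancestor-or-self test, and the second div is outside the scope. Two things to keep in mind: the expression as written does not exclude a elements, and because //* only reaches elements below the div, text sitting directly in the div itself is not matched.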
    

    Anyway, thanks to Dimitre Novatchev.
