What expression would select all text nodes which are not blank and not inside a, script, or style elements?
This should do, assuming "not inside" means the text node is not supposed to be a descendant of an "a" or "script" or "style" element. If "not inside" only means not supposed to be a child then use parent::a and so on instead of ancestor::a.
//text()[normalize-space() and not(ancestor::a | ancestor::script | ancestor::style)]
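The difference between the ancestor:: and parent:: readings can be checked on a small document. The sketch below is only an illustration: it assumes the third-party lxml library for Python, and the sample markup and variable names are made up for the demonstration.

```python
from lxml import etree  # assumption: lxml is installed (third-party library)

# Hypothetical sample document, not taken from the question.
doc = etree.fromstring(
    "<div>"
    "<p>Keep me <a href='#'>link <b>bold link</b></a> tail</p>"
    "<script>var x = 1;</script>"
    "<style>p { color: red }</style>"
    "<span>   </span>"
    "</div>"
)

# "Not inside" read as "not a descendant of a, script or style".
ancestor_expr = (
    "//text()[normalize-space() and "
    "not(ancestor::a | ancestor::script | ancestor::style)]"
)

# "Not inside" read as "not a direct child of a, script or style".
parent_expr = (
    "//text()[normalize-space() and "
    "not(parent::a | parent::script | parent::style)]"
)

by_ancestor = [str(t) for t in doc.xpath(ancestor_expr)]
by_parent = [str(t) for t in doc.xpath(parent_expr)]

print(by_ancestor)  # ['Keep me ', ' tail']
print(by_parent)    # ['Keep me ', 'bold link', ' tail']
```

With the parent:: variant, the text inside the b element is still selected, because its parent is b rather than a.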
Use:
//*[not(self::a or self::script or self::style)]/text()[normalize-space()]
Not only is this expression shorter than the one in the currently accepted answer, but it also may be much more efficient.
Do note that the expression doesn't use any reverse (backward or upward) axes at all.
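To see what this expression picks up in practice, here is a minimal sketch in Python. It assumes the third-party lxml library, and the sample document is invented for illustration; neither comes from the answer itself.

```python
from lxml import etree  # assumption: lxml is installed (third-party library)

# Hypothetical sample document used only for this demonstration.
doc = etree.fromstring(
    "<div>"
    "<p>Keep me <a href='#'>link <b>bold link</b></a> tail</p>"
    "<script>var x = 1;</script>"
    "<style>p { color: red }</style>"
    "<span>   </span>"
    "</div>"
)

# Select non-blank text children of every element that is not
# itself an a, script or style element.
self_expr = (
    "//*[not(self::a or self::script or self::style)]"
    "/text()[normalize-space()]"
)

selected = [str(t) for t in doc.xpath(self_expr)]
print(selected)  # ['Keep me ', 'bold link', ' tail']
```

In this sample the predicate only tests the element that directly contains the text, so the text of the b element nested inside the a element is still returned.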
I used Dimitre Novatchev's answer, but then I stumbled upon the problem described by the topic starter: the text must not be a descendant of a, style, or script. Dimitre's answer excludes the style tag itself but includes its children.
This version also excludes style, script, and noscript tags and their descendants:
//div[@id='???']//*[not(ancestor-or-self::script or ancestor-or-self::noscript or ancestor-or-self::style)]/text()
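As a rough check of the ancestor-or-self:: version, the following Python sketch runs it against a small document. It assumes the third-party lxml library; the sample markup and the id value 'content' are hypothetical stand-ins, since the id in the expression above is not given.

```python
from lxml import etree  # assumption: lxml is installed (third-party library)

# Hypothetical sample document; the id 'content' is a made-up placeholder.
doc = etree.fromstring(
    "<body>"
    "<div id='content'>"
    "<p>Visible text</p>"
    "<style>p { color: red }</style>"
    "<noscript><p>Enable JavaScript</p></noscript>"
    "<script>var x = 1;</script>"
    "</div>"
    "<div id='footer'><p>Outside</p></div>"
    "</body>"
)

# Text children of elements under the div that are neither script,
# noscript or style themselves nor nested inside one of them.
expr = (
    "//div[@id='content']//*[not(ancestor-or-self::script or "
    "ancestor-or-self::noscript or ancestor-or-self::style)]/text()"
)

result = [str(t) for t in doc.xpath(expr)]
print(result)  # ['Visible text']
```

The p element inside noscript is dropped here because the ancestor-or-self:: test also looks at the elements above it, not only at the element holding the text.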
Anyway, thanks to Dimitre Novatchev.