Question
I need to extract certain text elements from the following code.
<div class="inhalt-links">
<h2>
Deutsche Verkehrswacht
<br>
Verkehrswacht Dortmund e. V.
<br>
</h2>
<h3>
Standnummer:
<span style="font-weight: normal;">4.E08</span>
</h3>
<div class="clear"></div>
<br>
Benediktinerstraße 82
<br>
44287 Dortmund
<br>
Deutschland
<br>
<br>
Tel.:+49 231 447687
<br>
Fax:+49 231 447136
<br>
E-Mail:info@verkehrswacht-dortmund.de
<br>
<a href="http://www.verkehrswacht-dortmund.de" class="url" target="_blank">www.verkehrswacht-dortmund.de</a>
<br>
<div class="social"></div>
<br>
</div>
To extract Tel.:+49 231 447687, I can use div[@class='inhalt-links']/text()[4]. For the other details (fax, e-mail, website), I just need to change the position index of text(). However, these text nodes do not always appear at the same position, as in the following code:
<div class="inhalt-links">
<h2>
DEW21
<br>
</h2>
<h3>
Standnummer:
<span style="font-weight: normal;">4.B56</span>
</h3>
<div class="clear"></div>
<br>
Günter-Samtlebe-Platz 1
<br>
44135 Dortmund
<br>
Postfach:104141
<br>
44041 Dortmund
<br>
Deutschland
<br>
<br>
Tel.:+49 231 544-0
<br>
Fax:+49 231 544-1130
<br>
E-Mail:vertrieb@dew21.de
<br>
<a href="http://www.dew21.de" class="url" target="_blank">www.dew21.de</a>
<br>
<div class="social"></div>
<br>
</div>
Here the XPath div[@class='inhalt-links']/text()[4] selects the text "44041 Dortmund" instead of Tel.:+49 231 544-0. Is there an XPath like "div[@class='inhalt-links']/text[starts with "Tel.:"]" that selects the Tel.: entry regardless of its position?
Answer 1:
"Is there any xpath like "//div[@class='inhalt-links']/text[starts with "Tel.:"]" to select the Tel.: element?"
Sure, try this:
//div[@class='inhalt-links']/text()[starts-with(normalize-space(), 'Tel.:')]
This XPath returns a text node (rather than an element): the one whose value, after leading and trailing whitespace is removed*, starts with the keyword Tel.:.
*) For reference, here is what normalize-space() does more precisely:
The normalize-space function strips leading and trailing white-space from a string, replaces sequences of whitespace characters by a single space, and returns the resulting string. [Mozilla Developer Network]
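For readers who want to try the idea without an XPath engine, here is a minimal sketch in Python that uses only the standard-library html.parser module. It is not the XPath from the answer but a hand-rolled equivalent: it collects the text nodes that are direct children of the div, collapses whitespace roughly the way normalize-space() does, and picks the node by prefix instead of by position. The names DirectTextCollector, field, VOID_TAGS and SAMPLE are hypothetical, and the sample markup is the second snippet from the question.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class DirectTextCollector(HTMLParser):
    """Collect text nodes that are direct children of <div class="inhalt-links">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # 0 = outside the target div, 1 = directly inside it
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif tag == "div" and "inhalt-links" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Collapse whitespace, similar in spirit to XPath normalize-space().
        text = " ".join(data.split())
        if self.depth == 1 and text:
            self.texts.append(text)


def field(html, prefix):
    """Return the value after `prefix` (e.g. 'Tel.:'), or None if no node matches."""
    parser = DirectTextCollector()
    parser.feed(html)
    parser.close()
    for text in parser.texts:
        if text.startswith(prefix):
            return text[len(prefix):].strip()
    return None


SAMPLE = """
<div class="inhalt-links">
<h2>DEW21<br></h2>
<h3>Standnummer: <span style="font-weight: normal;">4.B56</span></h3>
<div class="clear"></div>
<br>
Günter-Samtlebe-Platz 1
<br>
44135 Dortmund
<br>
Postfach:104141
<br>
44041 Dortmund
<br>
Deutschland
<br>
<br>
Tel.:+49 231 544-0
<br>
Fax:+49 231 544-1130
<br>
E-Mail:vertrieb@dew21.de
<br>
<a href="http://www.dew21.de" class="url" target="_blank">www.dew21.de</a>
<br>
<div class="social"></div>
<br>
</div>
"""

print(field(SAMPLE, "Tel.:"))    # +49 231 544-0
print(field(SAMPLE, "Fax:"))     # +49 231 544-1130
print(field(SAMPLE, "E-Mail:"))  # vertrieb@dew21.de
```

Because the lookup is by prefix, the extra Postfach lines in this snippet do not shift the result, which is the same property the starts-with() XPath provides.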
Source: https://stackoverflow.com/questions/36663142/xpath-to-get-data-starts-with-specific-character-or-string