Question
I have a series of <p> elements inside a document I'm scraping with Scrapy.
Some of them are:
<p><span>bla bla bla</span></p>
or
<p><span><span>bla bla bla</span><span>second bla bla</span></span></p>
I want to extract all the text, including the text of the child elements (assume I already have the selector for the <p>).
For the second example, the result should be the string "bla bla bla second bla bla".
Answer 1:
You can use //text() to extract all the text from descendant nodes.
For example:
.//p//text()
Source: https://stackoverflow.com/questions/26564843/scrapy-get-the-entire-text-including-children