Given this HTML:
- This is a link
- This is another link
@Tomalak is correct in saying that XPath generally cannot select that which is not there.
However, in this case, the results you want are the string values of li elements. As you've found,
string(//ul/li)
gets you close but only returns the first desired string.
This points to a shortcoming in XPath 1.0 that was addressed in XPath 2.0.
In XPath 1.0, you have to iterate over the nodeset selected by //ul/li outside of XPath -- in XSLT, Python, Java, etc.
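As a rough illustration of iterating outside of XPath, here is a minimal Python sketch. It makes two assumptions: the markup is a guess at the original snippet (a ul with two li elements containing links), and the standard library's ElementTree stands in for a full XPath 1.0 engine. ElementTree supports only a subset of XPath and has no string() function, so the string value of each li is computed with itertext() instead. The helper name li_string_values is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed markup: a guess at the original snippet, for illustration only.
HTML = """
<div>
  <ul>
    <li>This is <a href="#">a link</a></li>
    <li>This is <a href="#">another link</a></li>
  </ul>
</div>
"""


def li_string_values(markup):
    """Return the string value of every ul/li element, one entry per li."""
    root = ET.fromstring(markup)
    # Select the nodeset with a path expression (ElementTree's XPath subset).
    items = root.findall(".//ul/li")
    # Iterate over the nodeset in the host language; joining all descendant
    # text mirrors what XPath's string() would return for each node.
    return ["".join(li.itertext()) for li in items]


if __name__ == "__main__":
    for value in li_string_values(HTML):
        print(value)
```

With a full XPath 1.0 engine the same pattern applies: select //ul/li first, then evaluate string(.) against each node in the host language.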
In XPath 2.0, the last location step can be a function call, so you can use
//ul/li/string()
to directly return
This is a link
This is another link
as requested.
This is more educational than practical if you're stuck with Scrapy, which only supports XPath 1.0, but knowing
how string() behaves is generally helpful in reasoning about XPath text selections.