Question
I am trying to scrape data from the following page using the lxml module in Python: http://www.thehindu.com/todays-paper/with-afspa-india-has-failed-statute-amnesty/article7376286.ece. I want to get the text of the first paragraph, but the following code returns an empty result:
from lxml import html
import requests
page = requests.get('http://www.thehindu.com/todays-paper/with-afspa-india-has-failed-statute-amnesty/article7376286.ece')
tree = html.fromstring(page.text)
data = tree.xpath('//*[@id="left-column"]/div[6]/p[1]/text()')
print(data)
I don't understand what I'm doing wrong here. Please suggest a better way to do this if there is one.
Answer 1:
Try //div[@class='article-text']/p/text()
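To illustrate why the attribute test needs the `@` prefix, here is a minimal sketch. The `SAMPLE` markup is a hypothetical stand-in, and the real page's structure (including the `article-text` class on the container) is assumed from the answers rather than verified.

```python
from lxml import html

# Hypothetical stand-in for the article page; the real markup may differ.
SAMPLE = """
<html><body>
  <div id="left-column">
    <div class="article-text">
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </div>
  </div>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Without "@", `class` is read as a child element name, so nothing matches.
without_at = tree.xpath("//div[class='article-text']/p/text()")

# With "@", `class` refers to the attribute, so the <p> text nodes are returned.
with_at = tree.xpath("//div[@class='article-text']/p/text()")

print(without_at)
print(with_at)
```

Selecting by class rather than by position (such as `div[6]`) tends to be less brittle, since positional paths break as soon as the page layout shifts.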
Answer 2:
You can use the following XPath:
//div[@class='article-text']/p[1]/text()
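As a rough sketch of how `p[1]` behaves, the snippet below runs the query against hypothetical markup (the `SAMPLE` string is an assumption, not the real page). It also shows that `text()` returns only the paragraph's direct text nodes, so if the paragraph contains inline tags, lxml's `text_content()` may be the more convenient way to get the full text.

```python
from lxml import html

# Hypothetical markup; the structure of the real page is an assumption.
SAMPLE = """
<html><body>
  <div class="article-text">
    <p>The <b>first</b> paragraph.</p>
    <p>Second paragraph.</p>
  </div>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# text() yields only the direct text nodes of the first <p>, skipping <b>.
pieces = tree.xpath("//div[@class='article-text']/p[1]/text()")

# Selecting the element itself and calling text_content() includes nested tags.
paragraphs = tree.xpath("//div[@class='article-text']/p[1]")
first_paragraph = paragraphs[0].text_content() if paragraphs else None

print(pieces)
print(first_paragraph)
```

Checking that `paragraphs` is non-empty before indexing avoids an `IndexError` when the query matches nothing, which is the situation described in the question.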
Source: https://stackoverflow.com/questions/31321709/python-xpath-query-not-returning-text-value