I have a Scrapy spider which crawls a site that reloads content via JavaScript on the page. In order to move to the next page to scrape, I have been using Selenium to click
The problem is that you are reusing the HtmlXPathSelector that was defined for the initial response. Redefine it from the Selenium browser's page source after each click:
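...
for month in months:
    link = self.br.find_element_by_link_text(month)
    link.click()
    time.sleep(5)
    # Rebuild the selector from the page Selenium has just rendered.
    # The markup is a string, not a Response, so pass it as text=.
    hxs = HtmlXPathSelector(text=self.br.page_source)
    # Get all the divs containing info to be scraped.
    listitems = hxs.select("//div[@class='listItem']")
...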
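To see why the selector has to be rebuilt, here is a small self-contained sketch that runs without Scrapy or Selenium. Everything in it is made up for illustration: `FakeBrowser` is a hypothetical stand-in for the Selenium driver, `PAGES` is invented markup, and `xml.etree.ElementTree` stands in for the Scrapy selector. It only shows that a parsed tree is a snapshot of the markup it was built from, so anything parsed before a click keeps returning the old items.

```python
import xml.etree.ElementTree as ET

# Invented markup for two "months"; a real page would come from the browser.
PAGES = {
    "January": "<html><body><div class='listItem'>jan-1</div>"
               "<div class='listItem'>jan-2</div></body></html>",
    "February": "<html><body><div class='listItem'>feb-1</div></body></html>",
}


class FakeBrowser:
    """Hypothetical stand-in for a Selenium driver."""

    def __init__(self, pages, first):
        self._pages = pages
        self.page_source = pages[first]

    def click_month(self, month):
        # Simulates JavaScript replacing the page content after a click.
        self.page_source = self._pages[month]


def list_items(page_source):
    # Parse only the markup passed in; the result is a snapshot of it.
    root = ET.fromstring(page_source)
    return [div.text for div in root.findall(".//div[@class='listItem']")]


def scrape(browser, months, reparse=True):
    initial_source = browser.page_source  # taken before any click
    results = {}
    for month in months:
        browser.click_month(month)
        # reparse=False mimics reusing the selector built for the first response.
        source = browser.page_source if reparse else initial_source
        results[month] = list_items(source)
    return results


fresh = scrape(FakeBrowser(PAGES, "January"), ["January", "February"])
stale = scrape(FakeBrowser(PAGES, "January"), ["January", "February"], reparse=False)
print(fresh["February"])
print(stale["February"])
```

With `reparse=False` the February entry still holds the January items, which is the symptom described in the question. Re-parsing the current page source inside the loop is what fixes it.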