Scraping with Scrapy and Selenium

忘掉有多难 2020-12-15 13:25

I have a Scrapy spider that crawls a site which reloads its content via JavaScript on the page. In order to move to the next page to scrape, I have been using Selenium to click

1 Answer
  • 2020-12-15 13:48

    The problem is that you are reusing the HtmlXPathSelector that was defined for the initial response, so it never sees the content loaded after each click. Redefine it from the Selenium browser's page_source inside the loop:

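    ...
    for month in months:
        # Click the month link; the page reloads its content via JavaScript.
        link = self.br.find_element_by_link_text(month)
        link.click()
        time.sleep(5)  # give the reloaded content time to appear
    
        # Rebuild the selector from the browser's current page source.
        hxs = HtmlXPathSelector(text=self.br.page_source)
    
        # Get all the divs containing info to be scraped.
        listitems = hxs.select("//div[@class='listItem']")
    ...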
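
The same idea can be sketched independently of any particular Scrapy or Selenium version. This is a minimal sketch, not the answerer's code: the function name `scrape_after_clicks` and the parameters `parse` and `pause` are hypothetical, and `driver` is assumed to be a Selenium WebDriver exposing `find_element` and `page_source`. The point it illustrates is only that the page source is read again after every click.

```python
import time


def scrape_after_clicks(driver, link_texts, parse, pause=5.0):
    """Click each link, then parse the page source as it is after the click.

    driver: a Selenium WebDriver (assumed); only find_element() and
            page_source are used here.
    parse:  hypothetical callable that takes the HTML string and returns
            whatever items should be scraped from it.
    """
    results = {}
    for text in link_texts:
        # "link text" is the string value of Selenium's By.LINK_TEXT locator.
        driver.find_element("link text", text).click()

        # Crude fixed wait for the JavaScript reload, as in the answer.
        time.sleep(pause)

        # Read page_source now, so the parser sees the reloaded content
        # rather than the response the spider originally received.
        results[text] = parse(driver.page_source)
    return results
```

With Scrapy installed, `parse` could be something like `lambda html: Selector(text=html).xpath("//div[@class='listItem']")`, where `Selector` comes from `scrapy.selector`; later Scrapy releases provide it in place of `HtmlXPathSelector`. A fixed sleep is the simplest wait but not the most reliable one; Selenium's explicit waits may be a better fit if the reload time varies.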
    