Pass extra values along with urls to scrapy spider

Submitted by 女生的网名这么多〃 on 2019-12-06 01:26:54

Question


I have a list of tuples of the form (id, url). I need to crawl a product from each URL, and once a product is crawled I need to store it in the database under its id.

The problem is that I can't work out how to pass the id to the parse function so that I can store each crawled item under its id.


Answer 1:


Initialize the start URLs in start_requests() and pass the id along in the request's meta:

from scrapy import Spider, Request

class MySpider(Spider):
    mapping = [(1, 'my_url1'), (2, 'my_url2')]

    ...

    def start_requests(self):
        # Schedule one request per URL, attaching its id via meta
        for id, url in self.mapping:
            yield Request(url, callback=self.parse_page, meta={'id': id})

    def parse_page(self, response):
        # The id travels with the request and comes back on the response
        id = response.meta['id']


Source: https://stackoverflow.com/questions/23696934/pass-extra-values-along-with-urls-to-scrapy-spider
