Question
I have a list of tuples in the form (id, url). I need to crawl a product from each of those URLs, and when the products are crawled I need to store them in the database under their id.
The problem is that I can't work out how to pass the id to the parse function so that I can store each crawled item under its id.
Answer 1:
Initialize the start URLs in start_requests() and pass the id in meta:
from scrapy import Spider, Request

class MySpider(Spider):
    mapping = [(1, 'my_url1'), (2, 'my_url2')]
    ...

    def start_requests(self):
        # Yield one Request per (id, url) pair, carrying the id in meta
        for id, url in self.mapping:
            yield Request(url, callback=self.parse_page, meta={'id': id})

    def parse_page(self, response):
        # Retrieve the id that was attached to the corresponding request
        id = response.meta['id']
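To connect this back to the original goal of storing each product under its id, the callback can yield the id together with the scraped fields, and an item pipeline can then write the row to the database. The sketch below is only an illustration of what parse_page might yield; the 'title' selector and field names are assumptions, not part of the original answer:

    def parse_page(self, response):
        # The id carried in meta identifies which database row this item belongs to;
        # the 'title' selector is only an illustrative assumption.
        yield {
            'id': response.meta['id'],
            'title': response.css('title::text').get(),
        }

On Scrapy 1.7 and later, Request(..., cb_kwargs={'id': id}) is an alternative to meta; the id then arrives as a keyword argument of the callback.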
Source: https://stackoverflow.com/questions/23696934/pass-extra-values-along-with-urls-to-scrapy-spider