My problem is the following:
To save time, I would like to run several versions of a single spider. The process (parsing definitions) is the same.
Scrapy supports spider arguments. Weirdly enough, there's no straightforward documentation for this, but I'll try to fill the gap:
When you run a crawl command you may provide -a NAME=VALUE arguments, and these will be set as instance attributes on your spider. For example:
from scrapy import Spider, Request

class MySpider(Spider):
    name = 'arg'
    # defaults; overridden at crawl time via -a foo=... -a bar=...
    foo = None
    bar = None

    def start_requests(self):
        url = f'http://example.com/{self.foo}/{self.bar}'
        yield Request(url)
And if we run it:
scrapy crawl arg -a foo=1 -a bar=2
# will crawl http://example.com/1/2 (note: -a values arrive as strings)
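Since the original goal is to run several parameterized copies of the same spider, it's worth adding that Scrapy's CrawlerProcess can schedule multiple crawls from a single script, and keyword arguments passed to crawl() become spider attributes the same way -a does. A minimal sketch, assuming the spider above is importable from your project (the import path and the foo/bar values here are just illustrative):

from scrapy.crawler import CrawlerProcess

from myproject.spiders import MySpider  # hypothetical path; adjust to your project

process = CrawlerProcess()
# each crawl() call schedules an independent instance of the spider;
# kwargs are set as attributes, exactly like -a on the command line
process.crawl(MySpider, foo=1, bar=2)
process.crawl(MySpider, foo=3, bar=4)
process.start()  # runs both crawls concurrently; blocks until they finish

Both crawls share one Twisted reactor, so this is a lightweight way to fan one spider out over several argument sets without launching separate scrapy crawl commands.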