I don't have a specific code issue; I'm just not sure how to approach the following problem logistically with the Scrapy framework:
The structure of the data I want
You can also use Python's functools.partial to pass an item, or any other serializable data, via additional arguments to the next Scrapy callback.
Something like:
import functools

from scrapy import Request

# Inside your Spider class:
def parse(self, response):
    # ...
    # Process the first response here, populate item and next_url.
    # ...
    # Bind item and someotherarg now; Scrapy supplies the response later.
    callback = functools.partial(self.parse_next, item, someotherarg)
    return Request(next_url, callback=callback)

def parse_next(self, item, someotherarg, response):
    # ...
    # Process the second response here.
    # ...
    return item
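
To see why parse_next takes response as its last parameter, here is a minimal sketch that runs without Scrapy. FakeResponse and DemoSpider are hypothetical stand-ins invented for this illustration (they are not Scrapy classes), and the URLs are placeholders. It assumes, as in the snippet above, that the callback is invoked with the response as its only positional argument.

```python
import functools


class FakeResponse:
    """Hypothetical stand-in for a Scrapy response; illustration only."""

    def __init__(self, url):
        self.url = url


class DemoSpider:
    """Plain class, not a real scrapy.Spider; it only mimics the callback shape."""

    def parse_next(self, item, someotherarg, response):
        # Arguments bound by partial arrive first; the response arrives last.
        item["second_url"] = response.url
        item["extra"] = someotherarg
        return item


spider = DemoSpider()
item = {"first_url": "http://example.com/page1"}

# Bind item and the extra argument now; leave the response slot open.
callback = functools.partial(spider.parse_next, item, "some-value")

# Simulate the framework calling the callback with only the response.
result = callback(FakeResponse("http://example.com/page2"))
print(result)
```

functools.partial binds positional arguments from the left, so anything you bind this way has to come before response in the callback's signature.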