Question
I'm trying to pass a value from one callback function to another.
I looked up the docs and just didn't understand them. For reference, this is the example from the docs:
    def parse_page1(self, response):
        item = MyItem()
        item['main_url'] = response.url
        request = scrapy.Request("http://www.example.com/some_page.html",
                                 callback=self.parse_page2)
        request.meta['item'] = item
        yield request

    def parse_page2(self, response):
        item = response.meta['item']
        item['other_url'] = response.url
        yield item
Here is pseudocode of what I want to achieve:
    import scrapy

    class GotoSpider(scrapy.Spider):
        name = 'goto'
        allowed_domains = ['first.com', 'second.com']
        start_urls = ['http://first.com/']

        def parse(self, response):
            name = response.xpath(...)
            # pseudocode: I want the price that parse_check extracts from second.com
            price = scrapy.Request(second.com, callback=self.parse_check)
            yield (name, price)

        def parse_check(self, response):
            price = response.xpath(...)
            return price
Answer 1:
This is how you can pass any value, link, etc. to another callback method:
    import scrapy

    class GotoSpider(scrapy.Spider):
        name = 'goto'
        allowed_domains = ['first.com', 'second.com']
        start_urls = ['http://first.com/']

        def parse(self, response):
            name = response.xpath(...)
            # link to the second.com page where the price can be found;
            # it has to end up as a URL string (for example via .get() on the selector)
            link = response.xpath(...)
            request = scrapy.Request(url=link, callback=self.parse_check)
            # attach the value to the request so the next callback can read it
            request.meta['name'] = name
            yield request

        def parse_check(self, response):
            # the meta dict of the originating request is available on the response
            name = response.meta['name']
            price = response.xpath(...)
            # a plain dict works as an item; use an Item class from items.py
            # instead if you declared name and price fields there
            yield {"name": name, "price": price}
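
To see why `price = scrapy.Request(...)` in the question cannot hand back a price directly, it may help to look at a toy model of the request/callback loop. The sketch below is not Scrapy and does no networking. The `Request`, `Response`, `crawl` and `PAGES` names are made up for illustration, and the "pages" are canned strings. It only shows the idea: a callback yields a request, the engine runs that request's callback later, and whatever was stored in `meta` arrives with the response.

```python
from collections import deque


class Request:
    """Toy stand-in for a request: a URL, a callback, and a meta dict."""

    def __init__(self, url, callback):
        self.url = url
        self.callback = callback
        self.meta = {}


class Response:
    """Toy stand-in for a response; exposes the originating request's meta."""

    def __init__(self, request, body):
        self.url = request.url
        self.body = body
        self.meta = request.meta


# Hypothetical canned pages standing in for real downloads.
PAGES = {
    "http://first.com/": "widget",
    "http://second.com/widget": "9.99",
}


def parse(response):
    # First page: take the name, then schedule the second page.
    request = Request("http://second.com/widget", callback=parse_check)
    request.meta["name"] = response.body  # the value travels with the request
    yield request


def parse_check(response):
    # Second page: recover the name and combine it with the price.
    yield {"name": response.meta["name"], "price": response.body}


def crawl(start_url, start_callback):
    """Run callbacks until the queue is empty and collect the yielded items."""
    queue = deque([Request(start_url, start_callback)])
    items = []
    while queue:
        request = queue.popleft()
        response = Response(request, PAGES[request.url])
        for result in request.callback(response):
            if isinstance(result, Request):
                queue.append(result)  # follow-up request: schedule it
            else:
                items.append(result)  # anything else is treated as an item
    return items


items = crawl("http://first.com/", parse)
print(items)  # [{'name': 'widget', 'price': '9.99'}]
```

The request object itself never turns into a price. The combined item can only be built in the last callback, which is why the answer yields the dict from `parse_check` rather than from `parse`. In real Scrapy (1.7 and later), `cb_kwargs` on `scrapy.Request` is an alternative to `meta` for passing such values as keyword arguments to the callback.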
Source: https://stackoverflow.com/questions/46258343/scrapy-getting-values-from-multiple-sites