Scrapy: Passing item between methods

Submitted by 三世轮回 on 2019-12-18 11:08:13

Question


Suppose I have a BookItem and need to add information to it in both the parse phase and the detail phase:

def parse(self, response):
    # json.loads needs the response body, not the response object itself
    data = json.loads(response.body)
    for book in data['result']:
        item = BookItem()
        item['id'] = book['id']
        url = book['url']
        yield Request(url, callback=self.detail)

def detail(self, response):
    hxs = HtmlXPathSelector(response)
    item['price'] = ......
    # I want to continue with the same book item from the for loop above

Using the code as is leads to an undefined item in the detail phase. How can I pass the item to detail? Declaring it as detail(self, response, item) doesn't seem to work.


Answer 1:


There is an argument named meta for Request:

yield Request(url, callback=self.detail, meta={'item': item})

Then, in the detail function, access it this way:

item = response.meta['item']

The jobs topic in the Scrapy documentation has more details.




Answer 2:


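You can define a variable in the __init__ method. This keeps only one item on the spider at a time, so it is reliable only when a single request is in flight: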

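class MySpider(BaseSpider):
    ...

    def __init__(self, *args, **kwargs):
        # Let the base spider finish its own setup first
        super(MySpider, self).__init__(*args, **kwargs)
        self.item = None

    def parse(self, response):
        # json.loads needs the response body, not the response object itself
        data = json.loads(response.body)
        for book in data['result']:
            self.item = BookItem()
            self.item['id'] = book['id']
            url = book['url']
            yield Request(url, callback=self.detail)

    def detail(self, response):
        hxs = HtmlXPathSelector(response)
        # Fills in whichever item parse() stored most recently
        self.item['price'] = ....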
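To see why a shared attribute can mix up items, here is a small simulation in plain Python. It does not use Scrapy: FakeSpider and run are hypothetical stand-ins, and the scheduler is simplified to "collect every request first, then run the callbacks". Real Scrapy interleaves the work differently, but callbacks still run after parse() has moved on to later books.

```python
from collections import deque


class FakeSpider:
    """Stand-in for a spider; this is not a Scrapy class."""

    def __init__(self):
        self.item = None   # shared attribute, one slot for all books
        self.results = []

    def parse(self, books):
        for book in books:
            self.item = {"id": book["id"]}
            # Yield (callback, payload) in place of a real Request.
            yield self.detail, book["price"]

    def detail(self, price):
        # Writes into whichever item parse() stored most recently.
        self.item["price"] = price
        self.results.append(dict(self.item))


def run(spider, books):
    """Queue all requests, then run their callbacks in order."""
    pending = deque(spider.parse(books))
    while pending:
        callback, payload = pending.popleft()
        callback(payload)
    return spider.results


books = [{"id": 1, "price": "9.99"}, {"id": 2, "price": "19.99"}]
results = run(FakeSpider(), books)
print(results)
```

In this simulation both results carry id 2, and the book with id 1 is lost, because the loop replaced self.item before any callback ran. Attaching the item to each request avoids the problem.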


Source: https://stackoverflow.com/questions/20663162/scrapy-passing-item-between-methods
