Python Scrapy parse() function: where is the return value returned to?

Submitted by 左心房为你撑大大i on 2019-11-30 17:14:51

Question


I am new to Scrapy, so apologies if this question is trivial. I have read the official Scrapy documentation, and while going through it I came across this example:

import scrapy
from myproject.items import MyItem

class MySpider(scrapy.Spider):
    name = 'example.com'
    allowed_domains = ['example.com']
    start_urls = [
        'http://www.example.com/1.html',
        'http://www.example.com/2.html',
        'http://www.example.com/3.html',
    ]

    def parse(self, response):
        # Yield one item per <h3> element found on the page.
        for h3 in response.xpath('//h3').extract():
            yield MyItem(title=h3)
        # Yield one request per link; each response comes back to parse().
        for url in response.xpath('//a/@href').extract():
            yield scrapy.Request(url, callback=self.parse)

I know that the parse method must return items and/or requests, but where are these return values returned to?

One is an item and the other is a request, so I assume the two types are handled differently. In the case of CrawlSpider, a Rule has a callback. What about that callback's return value? Where does it go? Is it handled the same way as parse()?

I am still very confused about how Scrapy works, even after reading the documentation.


Answer 1:


According to the documentation:

The parse() method is in charge of processing the response and returning scraped data (as Item objects) and more URLs to follow (as Request objects).

In other words, returned or yielded items and requests are handled differently. Items are handed to the item pipelines and item exporters. Requests are put into the Scheduler, and from there they are passed on to the Downloader, which fetches the page and returns a response. The engine then receives the response and gives it back to the spider for processing, by calling the request's callback method.

The whole data flow is described in detail on the Architecture Overview page of the documentation.
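To make the routing concrete, here is a minimal toy sketch of that loop in plain Python. It is not Scrapy's implementation: Request, Response, Item, PAGES, download and run_engine are all hypothetical stand-ins invented for illustration, and the real engine is asynchronous and far more involved. It only shows the idea that whatever the callback yields is inspected by the caller, with requests going back into a queue and everything else going to the pipeline.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins; these are NOT Scrapy's real classes.
@dataclass
class Request:
    url: str
    callback: Callable

@dataclass
class Response:
    url: str
    links: list
    titles: list

@dataclass
class Item:
    title: str

# Assumed fake "web" for illustration: url -> (titles, links).
PAGES = {
    "http://example.com/1.html": (["First"], ["http://example.com/2.html"]),
    "http://example.com/2.html": (["Second"], []),
}

def download(request):
    """Plays the role of the Downloader: turn a request into a response."""
    titles, links = PAGES[request.url]
    return Response(request.url, links, titles)

def parse(response):
    """Plays the role of the spider callback: yields items and requests."""
    for title in response.titles:
        yield Item(title=title)
    for url in response.links:
        yield Request(url, callback=parse)

def run_engine(start_urls):
    """Plays the role of the engine: routes whatever the callback yields."""
    scheduler = deque(Request(u, callback=parse) for u in start_urls)
    seen = set(start_urls)   # crude duplicate filter
    pipeline_output = []     # stands in for the item pipelines
    while scheduler:
        request = scheduler.popleft()
        response = download(request)
        for result in request.callback(response):
            if isinstance(result, Request):
                if result.url not in seen:
                    seen.add(result.url)
                    scheduler.append(result)  # requests go back to the queue
            else:
                pipeline_output.append(result)  # items go to the pipeline
    return pipeline_output

items = run_engine(["http://example.com/1.html"])
print([item.title for item in items])  # → ['First', 'Second']
```

The same routing applies to any callback, which is why a callback attached to a CrawlSpider Rule can yield items and requests in the same way as parse().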

Hope that helps.



Source: https://stackoverflow.com/questions/26195982/python-scrapy-parse-function-where-is-the-return-value-returned-to
