How to handle all kinds of exceptions in a Scrapy project, in errback and callback?

北恋 2020-12-14 12:02

I am currently working on a scraper project where it is very important that EVERY request is properly handled, i.e., either an error is logged or a successful result is saved.

2 Answers
  •  我在风中等你
    2020-12-14 12:33

    At first, I thought it would be more "logical" to raise exceptions in the parsing callback and process them all in the errback; this could make the code more readable. But I tried it, only to find that the errback can only trap errors from the downloader module, such as non-200 response statuses. If I raise a self-implemented ParseError in the callback, the spider simply raises it and stops.

    Yes, you are right - callback and errback are meant to be used only with the downloader: Twisted is used for downloading a resource, and Twisted uses Deferreds - that's why the callbacks are needed.

    The only async part in Scrapy is usually the downloader; all the other parts work synchronously.
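
    To illustrate what the errback *does* catch, here is a minimal sketch of a spider that routes downloader-level failures (DNS errors, timeouts, non-200 statuses raised as `HttpError`) into items, so every request ends up either parsed or logged. The spider name and URLs are hypothetical placeholders.

    ```python
    import scrapy
    from scrapy.spidermiddlewares.httperror import HttpError
    from twisted.internet.error import DNSLookupError, TimeoutError, TCPTimedOutError


    class ErrbackSpider(scrapy.Spider):
        name = "errback_demo"  # hypothetical spider name
        start_urls = ["http://example.com/"]  # hypothetical URL

        def start_requests(self):
            for url in self.start_urls:
                # errback fires only for downloader-level failures
                yield scrapy.Request(url, callback=self.parse_item,
                                     errback=self.handle_error)

        def parse_item(self, response):
            # success path: save a result for every request that downloaded
            yield {"url": response.url, "status": response.status}

        def handle_error(self, failure):
            # failure.request is attached by Scrapy's downloader machinery
            request = failure.request
            if failure.check(HttpError):
                # non-2xx response (needs HTTPERROR settings / HttpError raised)
                response = failure.value.response
                yield {"url": response.url, "error": f"HTTP {response.status}"}
            elif failure.check(DNSLookupError, TimeoutError, TCPTimedOutError):
                yield {"url": request.url, "error": repr(failure.value)}
            else:
                yield {"url": request.url, "error": repr(failure.value)}
    ```

    Note that a `ParseError` raised inside `parse_item` will never reach `handle_error`; the errback only sees `Failure` objects produced while downloading.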

    So, if you want to catch all non-downloader errors, do it yourself:

    • wrap the callback body in a big try/except
    • or write a decorator for your callbacks that does this (I like this idea more)
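
    The decorator approach can be sketched in plain Python, with no Scrapy dependency. This is a minimal sketch assuming generator-style callbacks; the `catch_errors` name, the `ParseError` class, and the error-item fields are all hypothetical choices, not Scrapy APIs.

    ```python
    import functools


    class ParseError(Exception):
        """Hypothetical error raised when a page is missing expected data."""


    def catch_errors(callback):
        """Wrap a generator-style callback so any exception it raises is
        converted into an error item instead of crashing the spider."""
        @functools.wraps(callback)
        def wrapper(self, response, **kwargs):
            try:
                # iterate the wrapped callback so exceptions raised while
                # parsing are caught here, not by Scrapy's engine
                yield from callback(self, response, **kwargs)
            except Exception as exc:
                yield {
                    "url": response.url,
                    "error_type": type(exc).__name__,
                    "error": str(exc),
                }
        return wrapper


    class DemoSpider:  # in a real project this would subclass scrapy.Spider
        @catch_errors
        def parse(self, response):
            # simulate a parsing failure on a page missing a field
            raise ParseError("missing field")
            yield  # keeps this function a generator
    ```

    With this in place, a failed parse yields a loggable error item, so the "every request is either logged or saved" guarantee holds for parsing errors too, not just downloader errors.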
