Terminate Scrapy if a condition is met

Submitted by 大兔子大兔子 on 2019-12-21 02:52:26

Question


I have written a scraper using Scrapy in Python. It contains 100 start_urls.

I want to terminate the scraping process once a condition is met, i.e. stop scraping if a particular div is found. By terminate, I mean it should stop scraping all the URLs.

Is this possible?


Answer 1:


What you're looking for is the CloseSpider exception.

Add the following line somewhere at the top of your source file:

from scrapy.exceptions import CloseSpider

And when you detect that your termination condition is met, simply do something like

        raise CloseSpider('termination condition met')

in your callback method (instead of returning or yielding an Item or Request).

Note that requests that are still in progress (HTTP request sent, response not yet received) will still be parsed. No new requests will be processed, though.



Source: https://stackoverflow.com/questions/23884743/terminate-scrapy-if-a-condition-is-met
