Question
So I have a custom decorator called task that captures the status of a function, e.g.:
@task(task_name='tutorial',
      alert_name='tutorial')
def start():
    raw_data = download_data()
    data = parse(raw_data)
    push_to_db(data)

if __name__ == "__main__":
    start()
So here the task decorator monitors the status of the start function: if it fails, it sends an error message to a central monitoring system using alert_name; otherwise it sends a success message.
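For reference, the decorator behaves roughly like this (a simplified sketch; send_alert stands in for the client of our monitoring system, which I have not shown here):

import functools

def task(task_name, alert_name):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception:
                # send_alert is a placeholder for the central monitoring client
                send_alert(alert_name, task_name, status='failed')
                raise
            send_alert(alert_name, task_name, status='success')
            return result
        return wrapper
    return decorator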
Now I want to add this decorator to my Scrapy spiders to capture their status. But I do not know where it should go, since the spider entry point is not known when starting the spider with this command:
$ scrapy crawl tutorial
I have tried using CrawlerRunner inside the spider's .py file. It goes like this:
from scrapy.crawler import CrawlerRunner

@task(task_name='tutorial',
      alert_name='tutorial')
def start():
    runner = CrawlerRunner()
    runner.crawl(TutorialSpider)

if __name__ == "__main__":
    start()
There are two problems with this:
- Even if TutorialSpider fails, task still reports success. It seems that task can only capture the status of the runner.crawl call itself, which isolates spider errors from the decorator (see the sketch after this list).
- CrawlerRunner is not really meant for this, from my perspective. It should be used for starting multiple spiders at the same time, and I feel something is wrong about using it this way.
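To illustrate the first point: as far as I can tell, runner.crawl() only schedules the crawl and returns a Twisted Deferred immediately, so the decorated function returns before the spider has even run. Roughly (a sketch, assuming TutorialSpider and the task decorator from above are defined/importable in this file):

from twisted.internet import reactor
from scrapy.crawler import CrawlerRunner

@task(task_name='tutorial', alert_name='tutorial')
def start():
    runner = CrawlerRunner()
    d = runner.crawl(TutorialSpider)   # schedules the crawl, returns a Deferred immediately
    # If start() returned here, @task would already report success,
    # no matter what happens inside the spider later.
    d.addBoth(lambda _: reactor.stop())
    reactor.run()  # block until the crawl finishes
    # Even then, exceptions raised inside spider callbacks are caught and
    # logged by Scrapy rather than raised out of start(), so @task still
    # sees a "successful" run.

if __name__ == "__main__":
    start()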
So in summary I have two questions:
- Where should I put this task decorator so that it captures the status of Scrapy spiders?
- Is there a central place where I can add this decorator by default for all spiders when generating a new spider with the scrapy genspider command? I will have over 100 spiders in the future, and adding the decorator to each one would be cumbersome and hard to maintain. Ideally, all I need to do is provide task_name and alert_name as arguments when starting a spider.
Thank you so much for taking your time reading through this question and offering help.
Source: https://stackoverflow.com/questions/59458037/capture-scrapy-spider-running-status-using-an-already-defined-decorator