Capture scrapy spider running status using an already defined decorator

Submitted by 假如想象 on 2020-05-17 10:09:21

Question


So I have a custom decorator called task that captures the status of a function. e.g.,

@task(task_name='tutorial',
      alert_name='tutorial')
def start():
    raw_data = download_data()
    data = parse(raw_data)
    push_to_db(data)

if __name__ == "__main__":
    start()

So here the task decorator monitors the status of the start function: if the function fails, it sends an error message to a central monitoring system (using alert_name); otherwise it sends a success message.
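Since the decorator's implementation is not shown in the question, here is a hypothetical sketch of how such a task decorator might work. The send_alert function is a stand-in for the real reporting call, which is not shown either:

```python
import functools

def send_alert(alert_name, message):
    # Stand-in for the call to the central monitoring system
    # (the real reporting API is not shown in the question).
    print(f"[{alert_name}] {message}")

def task(task_name, alert_name):
    # Hypothetical reconstruction of the decorator described above:
    # it reports success or failure of the wrapped function and
    # re-raises any exception so the caller still sees it.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                send_alert(alert_name, f"{task_name} failed: {exc!r}")
                raise
            send_alert(alert_name, f"{task_name} succeeded")
            return result
        return wrapper
    return decorator
```

The key property for what follows: the decorator only sees exceptions that escape the wrapped call itself.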

Now I want to add this decorator to Scrapy spiders to capture their status. But I do not know where it should go, since the spider's entry point is not exposed when the spider is started with this command:

$ scrapy crawl tutorial

I have tried CrawlerRunner inside the spider's .py file. It goes like this:

from twisted.internet import reactor
from scrapy.crawler import CrawlerRunner

@task(task_name='tutorial',
      alert_name='tutorial')
def start():
    runner = CrawlerRunner()
    d = runner.crawl(TutorialSpider)  # crawl(), not crawls(); returns a Deferred
    d.addBoth(lambda _: reactor.stop())
    reactor.run()

if __name__ == "__main__":
    start()

There are two problems with this:

  1. Even if TutorialSpider fails, task still reports success. It seems task can only capture the status of the runner.crawl call itself, which isolates spider errors from the decorator.
  2. CrawlerRunner is not really meant for this, as far as I can tell. It is intended for running multiple spiders in the same process, and using it this way feels wrong.
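Problem 1 can be reproduced without Scrapy at all: runner.crawl() only schedules the crawl and returns a Deferred, so any spider exception fires later, outside the wrapped call. A minimal stdlib sketch of the same effect, using a thread in place of the Twisted reactor:

```python
import threading

errors = []

def failing_spider():
    # The "spider": its failure happens inside the worker thread.
    try:
        raise ValueError("spider failed")
    except ValueError as exc:
        errors.append(exc)

def crawl_async(fn):
    # Stand-in for runner.crawl(): it schedules the work and
    # returns immediately, before the work has run at all.
    t = threading.Thread(target=fn)
    t.start()
    return t

def start():
    # What the task decorator wraps: no exception ever escapes here,
    # so the decorator would report success regardless of the spider's fate.
    return crawl_async(failing_spider)

thread = start()    # returns without raising
thread.join()
print(len(errors))  # 1 -- the failure surfaced in the thread, not in start()
```

This is why any try/except around the scheduling call (which is all the decorator adds) never observes the spider's failure.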

So in summary I have two questions:

  1. Where should I put this task decorator so that it captures the status of scrapy spiders?
  2. Is there a central place where I can add this decorator by default for every spider generated with the scrapy genspider command? I will have over 100 spiders in the future, and adding the decorator to each one by hand would be cumbersome and hard to maintain. Ideally, all I would need to do is provide task_name and alert_name as arguments when starting a spider.
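On question 2, one mechanism worth noting: scrapy genspider builds spiders from templates, and the TEMPLATES_DIR setting lets a project supply its own. A custom template could at least guarantee every generated spider ships with the decorator imported. A sketch of such a template (assuming the decorator lives at myproject.monitoring.task; $classname, $name, and $domain are genspider's placeholders):

```python
# templates/spiders/monitored.tmpl
# Used with: scrapy genspider -t monitored <name> <domain>
# (TEMPLATES_DIR in settings.py must point at the project's templates/ directory.)
import scrapy

from myproject.monitoring import task  # assumed location of the custom decorator


class $classname(scrapy.Spider):
    name = "$name"
    allowed_domains = ["$domain"]

    # Where exactly @task can hook in is the open question above; the
    # template only ensures every generated spider starts from the same base.
    def parse(self, response):
        pass
```

This is a template fragment, not runnable Python: the $-placeholders are substituted by genspider at generation time.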

Thank you so much for taking your time reading through this question and offering help.

Source: https://stackoverflow.com/questions/59458037/capture-scrapy-spider-running-status-using-an-already-defined-decorator
