Airflow Scheduler Misunderstanding

牧云@^-^@ submitted on 2019-12-02 09:18:26

There is a whole FAQ page about Airflow jobs not being scheduled: https://airflow.apache.org/faq.html

The key thing to notice here is:

The Airflow scheduler triggers the task soon after the start_date + schedule_interval is passed.

As I understand it, you want a task with start_date=datetime(year=2019, month=8, day=7) to run daily at 15:00 UTC. schedule_interval="00 15 * * *" does mean the task runs every day at 15:00 UTC. However, according to the docs, the scheduler triggers a run only after start_date + schedule_interval has passed, so Airflow won't trigger the first run until the next day, August 8th 2019 at 15:00:00 UTC. Alternatively, you can change the start_date day to the 6th, so that the first run happens on August 7th.

It may be easier to think about this the ETL way: you can only process the data for a given period after that period has ended. August 7th 2019 15:00:00 UTC is the start of your first period, so you need to wait until August 8th 2019 15:00:00 UTC before the task for that period can run.
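The arithmetic above can be sketched in plain Python. This is only an illustration of the start_date + schedule_interval rule for a daily schedule, not Airflow's own code; the helper name first_daily_run is hypothetical, and it assumes a once-a-day cron expression such as "00 15 * * *".

```python
from datetime import datetime, timedelta, timezone


def first_daily_run(start_date, hour, minute=0):
    """Hypothetical helper: mimic the rule for a daily schedule at hour:minute.

    Returns (execution_date, trigger_time), where execution_date is the start
    of the first period and trigger_time is when that period has ended.
    """
    # First schedule tick at or after start_date.
    first_tick = start_date.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if first_tick < start_date:
        first_tick += timedelta(days=1)
    # A run is triggered only once its whole period (one day) has passed.
    interval = timedelta(days=1)
    return first_tick, first_tick + interval


start = datetime(2019, 8, 7, tzinfo=timezone.utc)
execution_date, trigger_time = first_daily_run(start, hour=15)
print(execution_date.isoformat())  # 2019-08-07T15:00:00+00:00
print(trigger_time.isoformat())    # 2019-08-08T15:00:00+00:00
```

Passing datetime(2019, 8, 6, tzinfo=timezone.utc) instead moves the first trigger to August 7th at 15:00 UTC, which matches the suggestion to change the day to the 6th.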

Also note that Airflow distinguishes between execution_date and start_date; you can find more here

schedule_interval="00 15 * * *" start_date=07-08-2019

The first run will be on 08-08-2019 at 15:00 (3:00 PM UTC) if you created this DAG before 15:00 on 07-08-2019.
