airflow trigger_dag execution_date is the next day, why?

Posted by 余生颓废 on 2019-12-03 05:17:57
mistercrunch

First, I recommend you use constants for start_date, because dynamic ones act unpredictably depending on when your Airflow pipeline is evaluated by the scheduler.

There is more information about start_date in an FAQ entry I wrote to sort all this out: https://airflow.apache.org/faq.html#what-s-the-deal-with-start-date
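To make the start_date point concrete, here is a rough sketch in plain Python. It is a simplified model, not Airflow's scheduler code: `first_run_is_due` is a hypothetical helper, and it assumes only that the first run fires once start_date plus one interval has passed.

```python
from datetime import datetime, timedelta


def first_run_is_due(start_date, interval, now):
    """Simplified model: the first run fires once its first period has closed."""
    return now >= start_date + interval


interval = timedelta(hours=1)
constant_start = datetime(2016, 1, 1)

# Three moments at which the scheduler re-evaluates the pipeline file.
evaluation_times = [
    datetime(2016, 1, 1, 0, 30),
    datetime(2016, 1, 1, 1, 0),
    datetime(2016, 1, 1, 5, 0),
]

# Constant start_date: the first period closes at 01:00, so the run becomes due.
constant_results = [
    first_run_is_due(constant_start, interval, now) for now in evaluation_times
]

# Dynamic start_date (as with datetime.now()): it is recomputed on every
# evaluation, so it always equals `now` and the period never closes.
dynamic_results = [
    first_run_is_due(now, interval, now) for now in evaluation_times
]

print(constant_results)  # [False, True, True]
print(dynamic_results)   # [False, False, False]
```

In this model the dynamic start_date keeps moving forward with the clock, so the first period is never complete.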

Now, about execution_date and when it is triggered: this is a common gotcha for people onboarding on Airflow. Airflow sets execution_date based on the left bound of the schedule period it is covering, not based on when it fires (which would be the right bound of the period). When running a schedule='@hourly' task, for instance, the task fires every hour. The run that fires at 2pm has an execution_date of 1pm, because it assumes that you are processing the 1pm to 2pm time window at 2pm. Similarly, if you run a daily job, the run with an execution_date of 2016-01-01 triggers soon after midnight on 2016-01-02.

This left-bound labelling makes a lot of sense when thinking in terms of ETL and differential loads, but gets confusing when thinking in terms of a simple, cron-like scheduler.
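The left-bound labelling described above can be sketched with nothing but the standard library. `execution_date_for` is a hypothetical helper written for illustration, not an Airflow function, and it assumes a scheduled run with a fixed interval.

```python
from datetime import datetime, timedelta


def execution_date_for(fire_time, interval):
    """Return the left bound of the period that closes at fire_time."""
    return fire_time - interval


# Hourly job: the run that fires at 2pm is labelled 1pm.
hourly_label = execution_date_for(datetime(2016, 1, 1, 14, 0), timedelta(hours=1))

# Daily job: the run that fires at midnight on 2016-01-02 is labelled 2016-01-01.
daily_label = execution_date_for(datetime(2016, 1, 2, 0, 0), timedelta(days=1))

print(hourly_label)  # 2016-01-01 13:00:00
print(daily_label)   # 2016-01-01 00:00:00
```

Read the other way round, a run labelled with a given execution_date only fires after one full interval has elapsed beyond that date.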

Airflow provides times in UTC. I am not sure which timezone you are running the tasks in, so make sure you think in UTC and schedule or trigger the jobs accordingly.

Try converting the time you want to trigger at to UTC, then trigger the DAG with that time. It works. For more information, see the link below:

https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls
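As a small sketch of the conversion suggested above, using only Python's standard library. The UTC+08:00 offset and the sample time are assumptions picked for illustration; substitute your own local timezone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the local zone is a fixed UTC+08:00 offset (illustrative only).
local_tz = timezone(timedelta(hours=8))

# The local wall-clock time at which you want the DAG to be triggered.
local_time = datetime(2019, 12, 3, 5, 0, tzinfo=local_tz)

# Convert to UTC, which is the timezone Airflow reports times in.
utc_time = local_time.astimezone(timezone.utc)

print(utc_time.isoformat())  # 2019-12-02T21:00:00+00:00
```

Note that the calendar date itself can differ between local time and UTC, as it does here, which is one way a date ends up looking a day off.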
