Question
Let's say the scheduler is stopped for 5 hours and I have a DAG scheduled to run twice every hour. When I restart the scheduler, I do not want Airflow to backfill all the runs that were missed; instead, I want it to continue from the current hour.
Answer 1:
To achieve this behavior, you can add the LatestOnlyOperator, which was only recently introduced on master, to the start of your DAG. It is not yet part of a released version, though (1.7.1.3 is the latest version as of the writing of this post).
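To make the idea concrete, here is a minimal plain-Python sketch of the kind of check a "latest only" gate performs. It is a simplified model for illustration, not Airflow's actual source: the function name `is_latest_run`, the dates, and the 30-minute interval are all assumptions chosen to match the scenario in the question. It relies on the fact that Airflow triggers a run at the end of its schedule period.

```python
from datetime import datetime, timedelta


def is_latest_run(execution_date, interval, now):
    """Return True if `now` falls in the window where this run is the newest one.

    Hypothetical helper: a simplified model of a "latest only" check.
    """
    # A run for `execution_date` is triggered once its period has ended.
    window_start = execution_date + interval
    # It stops being the latest run as soon as the next period has ended too.
    window_end = window_start + interval
    return window_start < now <= window_end


interval = timedelta(minutes=30)         # "twice every hour"
restart = datetime(2016, 3, 19, 12, 10)  # assumed scheduler restart time

# The 10 runs that came due during the assumed 5-hour outage.
first_missed = datetime(2016, 3, 19, 7, 0)
runs = [first_missed + i * interval for i in range(10)]

# Only the most recent run passes the gate; the older ones would be skipped.
to_execute = [d for d in runs if is_latest_run(d, interval, restart)]
print(to_execute)
```

Note that with this approach the missed DAG runs are still created after the restart; the operator only skips the tasks downstream of it for every run that is not the latest one.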
Answer 2:
I'm sure you're no longer waiting for an answer, but for reference, this is covered here: https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls.
"When needing to change your start_date and schedule interval, change the name of the dag (a.k.a. dag_id) - I follow the convention : my_dag_v1, my_dag_v2, my_dag_v3, my_dag_v4, etc..."
Source: https://stackoverflow.com/questions/36098572/how-to-deploy-modified-airflow-dag-from-a-different-start-time