Airflow Scheduler Not Respecting EndTime with datetime.now()+timedelta()

Submitted by 末鹿安然 on 2021-01-02 02:48:04

Question


I am trying to schedule a DAG to run every x seconds. I set the start time to a past date with catchup=False, and the end time to a few seconds in the future.

Although the DAG starts as expected, it never ends and keeps running forever.

The DAG ends if I use an absolute end time such as datetime(2019,9,26), but not with datetime.now()+timedelta(seconds=100).

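from datetime import datetime, timedelta

from airflow import DAG

start_date = datetime(2019, 1, 1)
# Relative end time: 200 seconds after the moment this line is evaluated
end_date = datetime.now() + timedelta(seconds=200)

default_args = {
    "owner": "airflow",
    "depends_on_past": True,
    "start_date": start_date,
    "end_date": end_date,
}

# Run every 20 seconds, one active run at a time, without backfilling past intervals
dag = DAG(
    "file_dag",
    catchup=False,
    default_args=default_args,
    schedule_interval=timedelta(seconds=20),
    max_active_runs=1,
)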

I expect the DAG to stop executing after maybe 10 or 11 runs, depending on when it started. But it keeps executing even after 20 runs and does not seem to stop.


Answer 1:


You cannot / must not use datetime.now() in start_date and end_date expressions


The behaviour you are observing is expected:

  • Recall that DAG-definition files are parsed continuously in the background. Section [6], "Restrict the number of Airflow variables in your DAG", of "Airflow: Lesser Known Tips, Tricks and Best Practices" says:

    Your DAG files are parsed every X seconds

  • On each parsing cycle of your DAG-definition file, end_date is recomputed as 200 seconds after the current time. Since parsing of DAG-definition files goes on forever, end_date keeps shifting forward and you get a never-ending DAG.
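The effect of repeated parsing can be illustrated without Airflow at all. The following is a minimal sketch in plain Python, not Airflow's actual scheduler logic: the helper names (parsed_end_date, simulate), the 30-second parse interval, and the fixed end time are all hypothetical, chosen only for illustration. It compares an end date recomputed at every parse with one that is fixed once.

```python
from datetime import datetime, timedelta

# Hypothetical fixed end time: 200 seconds after the first simulated parse.
FIXED_END_DATE = datetime(2019, 9, 26, 0, 3, 20)


def parsed_end_date(parse_time):
    """Mimic `datetime.now() + timedelta(seconds=200)` evaluated at parse time."""
    return parse_time + timedelta(seconds=200)


def simulate(first_parse, parse_interval, n_parses):
    """For each simulated parse, record whether each end date is still in the future."""
    rows = []
    for i in range(n_parses):
        parse_time = first_parse + i * parse_interval
        moving_still_ahead = parsed_end_date(parse_time) > parse_time
        fixed_still_ahead = FIXED_END_DATE > parse_time
        rows.append((parse_time, moving_still_ahead, fixed_still_ahead))
    return rows


# Assumption: the file is re-parsed every 30 seconds (the real interval is configurable).
rows = simulate(datetime(2019, 9, 26, 0, 0, 0), timedelta(seconds=30), 10)
for parse_time, moving_still_ahead, fixed_still_ahead in rows:
    print(parse_time.time(), "moving:", moving_still_ahead, "fixed:", fixed_still_ahead)
```

In this sketch the recomputed end date is ahead of the clock at every parse, while the fixed one is eventually passed. That matches the question's observation that an absolute value such as datetime(2019,9,26) lets the DAG end.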



Source: https://stackoverflow.com/questions/58105197/airflow-scheduler-not-respecting-endtime-with-datetime-nowtimedelta
