Can I have tasks under one DAG with different start dates in Airflow?

Submitted by 不羁岁月 on 2019-12-08 05:09:52

Question


I have a DAG which runs two tasks: A and B.

Instead of specifying the start_date at the DAG level, I have added it as an attribute to the operators (I am using a PythonOperator in this case) and removed it from the DAG dictionary. Both tasks run daily.

The start_date for A is 2013-01-01 and the start_date for B is 2015-01-01. My problem is that Airflow runs task A for 16 days starting from 2013-01-01 (I guess because I have left the default dag_concurrency = 16 in my airflow.cfg) and after that it stops. The DAG runs are in state running and the tasks for B are in state "no status".

Clearly I am doing something wrong. I could simply set the start_date at the DAG level and have B run from the start_date of A, but that's not what I want to do.

Alternatively I can split them in separate DAGs, but again, that's not how I want to monitor them.

Is there a way to have a DAG with multiple tasks each having its own start_date? If so, how to do this?

UPDATE:

I know that a ShortCircuitOperator can be added, but this seems to work only for a chain of dependent tasks where there is something downstream to skip. In my case A is independent of B.


Answer 1:


Use a BranchPythonOperator and check in that task whether execution_date >= '2015-01-01'. If it is, the branch should follow Task B; if not, it should follow a dummy task.

However, I would recommend using a separate DAG.
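A minimal sketch of that branching setup, assuming the Airflow 1.10.x import paths (matching the 1.10.2 docs linked in this answer). The DAG id, the task ids (task_a, task_b, skip_b, check_b_start) and the helpers run_a, run_b and build_dag are hypothetical names chosen for illustration. The comparison is done on calendar dates so it does not depend on whether execution_date is timezone-aware.

```python
from datetime import date, datetime

# Date from which task B should run (2015-01-01, taken from the question).
B_START = date(2015, 1, 1)


def choose_b_or_skip(execution_date, **_):
    """Return the task_id the branch should follow for this run."""
    return "task_b" if execution_date.date() >= B_START else "skip_b"


def run_a():
    print("running A")  # placeholder for the real work of task A


def run_b():
    print("running B")  # placeholder for the real work of task B


def build_dag():
    # Assumption: Airflow 1.10.x module layout; newer releases moved these modules.
    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.python_operator import (
        BranchPythonOperator,
        PythonOperator,
    )

    # One DAG-level start_date: the earlier of the two dates (task A's).
    dag = DAG("a_and_b", start_date=datetime(2013, 1, 1), schedule_interval="@daily")

    # Task A stays independent of the branch and runs for every execution date.
    PythonOperator(task_id="task_a", python_callable=run_a, dag=dag)

    # provide_context=True passes execution_date to the callable in 1.10.x.
    branch = BranchPythonOperator(
        task_id="check_b_start",
        python_callable=choose_b_or_skip,
        provide_context=True,
        dag=dag,
    )
    task_b = PythonOperator(task_id="task_b", python_callable=run_b, dag=dag)
    skip_b = DummyOperator(task_id="skip_b", dag=dag)

    # The branch follows exactly one of these; the other is marked skipped.
    branch.set_downstream([task_b, skip_b])
    return dag
```

In a DAG file, a module-level `dag = build_dag()` should be enough for the scheduler to pick it up. For runs before 2015-01-01, task_b would show as skipped rather than having no status, while task_a runs every day from 2013-01-01.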

Documentation on branching: https://airflow.readthedocs.io/en/1.10.2/concepts.html#branching



Source: https://stackoverflow.com/questions/55329782/can-i-have-tasks-under-one-dag-with-different-start-dates-in-airflow
