Question
I have to
- update a table Foo monthly
- and another table Bar daily
- and join these two tables daily and insert the result into a third table Bazz
Is it possible to configure this so that Foo is updated on a certain day (say the 5th), while Bar is updated daily, and they are both in the same DAG?
Answer 1:
This behaviour can be achieved within a single DAG using either of the following alternatives:
- ShortCircuitOperator
- AirflowSkipException (better in my opinion)
Basically, your DAG would still run each day (schedule_interval='@daily'), but
- on a daily basis, only your Bar task would run, while Foo would get skipped (or short-circuited);
- on one particular day (like the 5th of each month), both would run.
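The AirflowSkipException variant could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Airflow 2.x import paths, and the names is_foo_day, update_foo, update_bar, build_bazz, build_dag and the dag_id are hypothetical, with placeholder task bodies. It also assumes Bazz sits downstream of both Foo and Bar, in which case Bazz needs a trigger rule such as none_failed; with the default all_success rule, a skipped Foo would cause Bazz to be skipped as well.

```python
from datetime import date, datetime

FOO_DAY_OF_MONTH = 5  # assumption: Foo is refreshed on the 5th, as in the question


def is_foo_day(ds: str, day_of_month: int = FOO_DAY_OF_MONTH) -> bool:
    """Return True when the run's logical date (YYYY-MM-DD) falls on the chosen day."""
    return date.fromisoformat(ds).day == day_of_month


def update_foo(ds: str, **context):
    # Imported lazily so the date check above stays usable without Airflow installed.
    from airflow.exceptions import AirflowSkipException

    if not is_foo_day(ds):
        # Marks this task instance as skipped instead of failed.
        raise AirflowSkipException(f"Foo is not due on {ds}; skipping")
    # Placeholder: the real monthly update of Foo goes here.


def update_bar(**context):
    # Placeholder: the real daily update of Bar goes here.
    pass


def build_bazz(**context):
    # Placeholder: join Foo and Bar and insert the result into Bazz.
    pass


def build_dag():
    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from airflow.utils.trigger_rule import TriggerRule

    with DAG(
        dag_id="foo_bar_bazz",  # hypothetical name
        start_date=datetime(2019, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        foo = PythonOperator(task_id="update_foo", python_callable=update_foo)
        bar = PythonOperator(task_id="update_bar", python_callable=update_bar)
        bazz = PythonOperator(
            task_id="build_bazz",
            python_callable=build_bazz,
            # Run even when update_foo was skipped, as long as nothing failed.
            trigger_rule=TriggerRule.NONE_FAILED,
        )
        [foo, bar] >> bazz
    return dag
```

In an actual DAG file, build_dag() would need to be called at module level (for example dag = build_dag()) so that the scheduler can discover the DAG. Note that ds is the run's logical date rather than the wall-clock date, so with a daily schedule the run stamped with the 5th typically starts once that day's interval has ended.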
You can, of course, also model these as separate DAGs and chain them together (rather than as individual tasks within a single DAG). This choice might be better as long as the number of DAGs that you are linking together is small.
Source: https://stackoverflow.com/questions/57104547/how-to-define-a-dag-that-scheduler-a-monthly-job-together-with-a-daily-job