Can I programmatically determine if an Airflow DAG was scheduled or manually triggered?

血红的双手。 Submitted on 2020-05-15 09:37:05

Question


I want to create a snippet that passes the correct date based on whether the DAG was scheduled or whether it was triggered manually. The DAG runs monthly. The DAG generates a report (A SQL query) based on the data of the previous month.

If I run the DAG scheduled, I can fetch the previous month with the following jinja snippet:

{{ execution_date.month }}

Given that the DAG is scheduled at the end of the previous period (last month), execution_date correctly returns last month. However, on manual runs it returns the current month, because execution_date is then the date of the manual trigger.

I want to write a simple macro that deals with this case. However, I could not find a good way to programmatically query whether the DAG was triggered manually. The best I could come up with is to fetch the run_id from the database (by creating a macro that has a DB session) and check whether the run_id contains the word manual. Is there a better way to solve this problem?


Answer 1:


There is no direct DAG property to identify manual runs for now. To get this information you would need to check the run_id as you mentioned.

However, there is a dedicated macro to get the run_id, so you don't have to fetch it from the database yourself. Here is an example of how to use it:

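    from airflow.operators.python_operator import PythonOperator  # Airflow 1.10 import path


    def some_task_py(**context):
        # run_id is rendered from the template passed in templates_dict below
        run_id = context['templates_dict']['run_id']
        is_manual = run_id.startswith('manual__')
        is_scheduled = run_id.startswith('scheduled__')


    some_task = PythonOperator(
        task_id='some_task',
        dag=dag,  # assumes a DAG object named `dag` is defined elsewhere
        templates_dict={'run_id': '{{ run_id }}'},
        python_callable=some_task_py,
        provide_context=True,  # needed on Airflow 1.x to pass the context as kwargs
    )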
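
To connect this back to the question, here is a minimal sketch of how the run_id prefix check above could pick the month for the report. The function name report_month is hypothetical and not part of Airflow, and the sketch assumes that a manual run should report on the month before the trigger date, while a scheduled run reports on execution_date's own month, as the question describes.

```python
from datetime import datetime


def report_month(run_id: str, execution_date: datetime) -> int:
    """Return the month (1-12) the monthly report should cover.

    Hypothetical helper, not part of Airflow.
    """
    if run_id.startswith("manual__"):
        # Manual run: execution_date is the trigger time, so step back one
        # month, wrapping January to December.
        return 12 if execution_date.month == 1 else execution_date.month - 1
    # Scheduled run: execution_date already falls in the reporting month.
    return execution_date.month
```

A function like this could probably be registered through the DAG's user_defined_macros argument and then called from a template as {{ report_month(run_id, execution_date) }}, but that wiring is untested here.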



Answer 2:


tl;dr: You can determine this with DagRun.external_trigger.


I noticed that in the Tree View, there's an outline around runs that are scheduled, but not manual. That's because the latter has stroke-opacity: 0; applied in CSS.

Searching the repo for this, I found how the Airflow devs detect manual runs (the line is five years old, so it should work in older versions as well):

.style("stroke-opacity", function(d) {return d.external_trigger ? "0": "1"})

Searching for external_trigger brings us to the DagRun definition.

So if you were using, for example, a Python callback, you can have something like this (can be defined in the DAG, or a separate file):

def my_fun(context):
    if context.get('dag_run').external_trigger:
        print('manual run')
    else:
        print('scheduled run')

Then set the parameter in your operator like this:

t1 = BashOperator(
    task_id='print_date',
    bash_command='date',
    on_failure_callback=my_fun,
    dag=dag,
)

I have tested something similar and it works.
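
The same check can be tried without a running Airflow instance by passing a stand-in context. This is only a sketch: run_kind is a hypothetical helper, and the SimpleNamespace objects merely imitate the one attribute (external_trigger) that the callback above reads from the real DagRun.

```python
from types import SimpleNamespace


def run_kind(context):
    """Classify a run from the task context (hypothetical helper)."""
    dag_run = context.get("dag_run")
    if dag_run is None:
        # No DagRun in the context, e.g. when called outside a real run.
        return "unknown"
    return "manual" if dag_run.external_trigger else "scheduled"


# Stand-ins for the context Airflow would pass to a callback.
manual_ctx = {"dag_run": SimpleNamespace(external_trigger=True)}
scheduled_ctx = {"dag_run": SimpleNamespace(external_trigger=False)}

print(run_kind(manual_ctx))
print(run_kind(scheduled_ctx))
```

Returning a value instead of printing makes the result easy to reuse, for example to choose which date to pass to the report query.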

I think you can also do something like if {{ dag_run.external_trigger }}: but I haven't tested this, and I believe it would only work in that DAG's file.



Source: https://stackoverflow.com/questions/60077575/can-i-programmatically-determine-if-an-airflow-dag-was-scheduled-or-manually-tri
