Airflow: PythonOperator: why to include 'ds' arg?

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-04 16:50:50

Question


While defining a function to be later used as a python_callable, why is 'ds' included as the first arg of the function?

For example:

def python_func(ds, **kwargs):
    pass

I looked into the Airflow documentation, but could not find any explanation.


Answer 1:


This is related to the provide_context=True parameter of PythonOperator. According to the Airflow documentation:

if set to true, Airflow will pass a set of keyword arguments that can be used in your function. This set of kwargs correspond exactly to what you can use in your jinja templates. For this to work, you need to define **kwargs in your function header.

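ds is one of these keyword arguments and represents the execution date in the format "YYYY-MM-DD". Because the arguments are passed by keyword, ds is matched by name, not by position, so it does not actually have to come first. For parameters that are marked as (templated) in the documentation, you can use the '{{ ds }}' default variable to pass the execution date. You can read more about default variables here: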

https://pythonhosted.org/airflow/code.html?highlight=pythonoperator#default-variables (obsolete)

https://airflow.incubator.apache.org/concepts.html?highlight=python_callable
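As a rough illustration of the mechanism described above, the following plain-Python sketch imitates an operator calling the callable with a set of keyword arguments. The fake_context dict and its task_id key are hypothetical stand-ins; the real Airflow context contains many more entries, and this is not Airflow's actual code.

```python
# Hypothetical stand-in for the context Airflow builds for a task run.
# Only 'ds' is taken from the text above; 'task_id' is illustrative.
fake_context = {"ds": "2019-12-04", "task_id": "example_task"}


def python_func(ds, **kwargs):
    # 'ds' is bound by name; every other key is collected in kwargs.
    return ds, sorted(kwargs)


def python_func_reordered(task_id, ds, **kwargs):
    # Keyword binding ignores position, so 'ds' need not be first.
    return ds, task_id, sorted(kwargs)


# Imitates the operator invoking the callable with keyword arguments.
first_result = python_func(**fake_context)
reordered_result = python_func_reordered(**fake_context)
print(first_result)
print(reordered_result)
```

Naming ds in the signature is therefore just a convenience: it pulls that one value out of the keyword arguments so the function body can use it directly.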

python_callable is not a templated parameter of PythonOperator, so doing something like

python_callable=print_execution_date('{{ ds }}')

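won't work: the function is called immediately while the DAG file is being parsed, with the literal string '{{ ds }}', and its return value (None), not the function itself, is what gets passed as python_callable. To print the execution date inside the callable function of your PythonOperator, define it as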

def print_execution_date(ds, **kwargs):
    print(ds)

or

def print_execution_date(**kwargs):
    print(kwargs.get('ds'))
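The difference between handing over a function and calling it inline can be checked without Airflow. This is a minimal sketch in plain Python; the names deferred and immediate are made up for the illustration.

```python
def print_execution_date(ds, **kwargs):
    print(ds)


# Passing the function object: nothing runs yet, and the value is still
# callable, so an operator could invoke it later with real keyword arguments.
deferred = print_execution_date

# Calling it inline: the body runs right away, prints the unrendered
# template string, and evaluates to the return value (None here).
immediate = print_execution_date('{{ ds }}')

print(callable(deferred))
print(immediate is None)
```

Only the first form leaves something that can be called at run time, which is why python_callable should receive the bare function name.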

Hope this helps.



Source: https://stackoverflow.com/questions/40531952/airflow-pythonoperator-why-to-include-ds-arg
