execution_date in airflow: need to access as a variable

自闭症网瘾萝莉.ら 提交于 2019-11-28 05:15:52
Erik Schuchmann

The BashOperator's bash_command argument is a template. You can access execution_date in any template as a datetime object using the execution_date variable. In the template, you can use any jinja2 methods to manipulate it.

Using the following as your BashOperator bash_command string:

# pass in the first of the current month
some_command.sh {{ execution_date.replace(day=1) }}

# last day of previous month
some_command.sh {{ execution_date.replace(day=1) - macros.timedelta(days=1) }}

If you just want the string equivalent of the execution date, ds will return a datestamp (YYYY-MM-DD), ds_nodash returns same without dashes (YYYYMMDD), etc. More on macros is available in the Api Docs.


Your final operator would look like:

command = """curl -XPOST '%(hostname)s:8000/run?st={{ ds }}'""" % locals()
t1 = BashOperator( task_id='rest-api-1', bash_command=command, dag=dag)

The PythonOperator constructor takes a 'provide_context' parameter (see https://pythonhosted.org/airflow/code.html). If it's True, then it passes a number of parameters into the python_callable via kwargs. kwargs['execution_date'] is what you want, I believe.

Something like this:

def python_method(ds, **kwargs):
    Variable.set('execution_date', kwargs['execution_date'])
    return

doit = PythonOperator(
    task_id='doit',
    provide_context=True,
    python_callable=python_method,
    dag=dag)

I'm not sure how to do it with the BashOperator, but you might start with this issue: https://github.com/airbnb/airflow/issues/775

Babcool

I think you can't assign variables with values from the airflow context outside of a task instance, they are only available at run-time. Basically there are 2 different steps when a dag is loaded and executed in airflow :

  • First your dag file is interpreted and parsed. It has to work and compile and the task definitions must be correct (no syntax error or anything). During this step, if you make function calls to fill some values, these functions won't be able to access airflow context (the execution date for example, even more if you're doing some backfilling).

  • The second step is the execution of the dag. It's only during this second step that the variables provided by airflow (execution_date, ds, etc...) are available as they are related to an execution of the dag.

So you can't initialize global variables using the Airflow context, however, Airflow gives you multiple mechanisms to achieve the same effect :

  1. Using jinja template in your command (it can be in a string in the code or in a file, both will be processed). You have the list of available templates here : https://airflow.apache.org/macros.html#default-variables. Note that some functions are also available, particularly for computing days delta and date formatting.

  2. Using a PythonOperator in which you pass the context (with the provide_context argument). This will allow you to access the same template with the syntax kwargs['<variable_name']. If you need so, you can return a value from a PythonOperator, this one will be stored in an XCOM variable you can use later in any template. Access to XCOM variables use this syntax : https://airflow.apache.org/concepts.html#xcoms

  3. If you write your own operator, you can access airflow variables with the dict context.

def execute(self, context):
    execution_date = context.get("execution_date")

This should be inside the execute() method of Operator

Aditi Srivastava

To print execution date inside the callable function of your PythonOperator you can use the following in your Airflow Script and also can add start_time and end_time as follows:

def python_func(**kwargs):
    ts = kwargs["execution_date"]
    end_time = str(ts)
    start_time = str(ts.add(minutes=-30))

I have converted the datetime value to string as I need to pass it in a SQL Query. We can use it otherwise also.

You may consider SimpleHttpOperator https://airflow.apache.org/_api/airflow/operators/http_operator/index.html#airflow.operators.http_operator.SimpleHttpOperator. It’s so simple for making http request. you can pass execution_date with endpoint parameter via template.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!