Python Airflow - Return result from PythonOperator

Submitted by 烂漫一生 on 2019-11-28 07:52:20

You might want to check out Airflow's XCOM: https://airflow.apache.org/concepts.html#xcoms

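If the function you pass to a PythonOperator returns a value, Airflow stores that value in XCom under the default key return_value. In your case, you could access it from the Python code of another task like so (on Airflow 1.x the operator needs provide_context=True for kwargs to contain task_instance):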

task_instance = kwargs['task_instance']
task_instance.xcom_pull(task_ids='Task1')

or in a template like so:

{{ task_instance.xcom_pull(task_ids='Task1') }}

If you want to store a value under a specific key, you can push it to XCom yourself from inside a task:

task_instance = kwargs['task_instance']
task_instance.xcom_push(key='the_key', value=my_str)

Then later on you can access it like so:

task_instance.xcom_pull(task_ids='my_task', key='the_key')
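To see both variants side by side without a running scheduler, here is a minimal sketch. FakeTaskInstance is a hypothetical stand-in written just for this example, not part of Airflow; it only mimics the call shapes of xcom_push and xcom_pull. The task ids, file names, and the produce/consume function names are also made up for illustration.

```python
# Hypothetical stand-in for Airflow's TaskInstance so the sketch runs on its own.
# Only the xcom_push / xcom_pull call shapes mirror the real API.
class FakeTaskInstance:
    def __init__(self, task_id, store):
        self.task_id = task_id
        self._store = store  # shared dict: (task_id, key) -> value

    def xcom_push(self, key, value):
        self._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        return self._store.get((task_ids, key))


def produce(**kwargs):
    """Callable for the upstream task 'Task1'."""
    # Explicit push under a custom key.
    kwargs["task_instance"].xcom_push(key="the_key", value="extra.csv")
    # A returned value is what Airflow stores under the default key.
    return "data.csv"


def consume(**kwargs):
    """Callable for a downstream task."""
    ti = kwargs["task_instance"]
    returned = ti.xcom_pull(task_ids="Task1")               # default key
    pushed = ti.xcom_pull(task_ids="Task1", key="the_key")  # custom key
    return returned, pushed


store = {}
upstream = FakeTaskInstance("Task1", store)
# Mimic what Airflow does once the callable finishes: store the return value.
upstream.xcom_push(key="return_value", value=produce(task_instance=upstream))
result = consume(task_instance=FakeTaskInstance("downstream", store))
print(result)
```

In a real DAG the two functions would be passed as python_callable to two PythonOperator tasks, and Airflow itself would supply task_instance and store the return value.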

EDIT 1

Follow-up question: Instead of using the value in another function, how can I pass the value to another PythonOperator, like "t2 = BashOperator(task_id='Moving_bucket', bash_command='python /home/raw.py "%s" '%file_name, dag=dag)"? I want to access file_name, which is returned by "Task1". How can this be achieved?

First of all, it seems to me that the value is, in fact, not being passed to another PythonOperator but to a BashOperator.

Secondly, this is already covered in my answer above. The field bash_command is templated (see template_fields in the source: https://github.com/apache/incubator-airflow/blob/master/airflow/operators/bash_operator.py). Hence, we can use the templated version:

BashOperator(
  task_id='Moving_bucket',
  bash_command="python /home/raw.py {{ task_instance.xcom_pull(task_ids='Task1') }}",
  dag=dag,
)

EDIT 2

Explanation: Airflow executes Task1, stores its return value in XCom, and only then executes the next task. So for your example to work, Moving_bucket must be set downstream of Task1 so that Task1 runs first.

Since your function returns the value, you could also omit key='file' from xcom_pull and skip pushing it manually inside the function.
