Python Airflow - Return result from PythonOperator

-上瘾入骨i 2020-12-08 08:47

I have written a DAG with multiple PythonOperators

task1 = af_op.PythonOperator(task_id='Data_Extraction_Environment',
                          provide_c         


        
1 Answer
  • 2020-12-08 09:14

    You might want to check out Airflow's XCom: https://airflow.apache.org/concepts.html#xcoms

    If the function called by a PythonOperator returns a value, Airflow stores that value in XCom. In your case, you could access it from another task's Python code like so:

    task_instance = kwargs['task_instance']
    task_instance.xcom_pull(task_ids='Task1')
    

    or in a template like so:

    {{ task_instance.xcom_pull(task_ids='Task1') }}
    

    If you want to use a specific key, you can push to XCom explicitly from inside a task:

    task_instance = kwargs['task_instance']
    task_instance.xcom_push(key='the_key', value=my_str)
    

    Then later on you can access it like so:

    task_instance.xcom_pull(task_ids='my_task', key='the_key')
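
    To make the two access patterns above concrete, here is a minimal sketch that runs without Airflow. FakeTaskInstance and run_task are hypothetical stand-ins written for this illustration, not Airflow classes; they only imitate the behavior described above, assuming a returned value is stored under the default key "return_value" and an explicit xcom_push uses whatever key you give it.

```python
# Stand-in for Airflow's XCom behavior. This is NOT the real implementation,
# only an in-memory imitation so the push/pull pattern can be run anywhere.
XCOM_RETURN_KEY = "return_value"  # default key name used for returned values


class FakeTaskInstance:
    """Hypothetical imitation of a task instance with xcom_push / xcom_pull."""

    _store = {}  # shared by all instances: (task_id, key) -> value

    def __init__(self, task_id):
        self.task_id = task_id

    def xcom_push(self, key, value):
        self._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key=XCOM_RETURN_KEY):
        # Returns None when nothing was stored for that task and key.
        return self._store.get((task_ids, key))


def run_task(task_id, python_callable):
    """Imitate a PythonOperator: call the function and store its return value."""
    ti = FakeTaskInstance(task_id)
    result = python_callable(task_instance=ti)
    if result is not None:
        ti.xcom_push(key=XCOM_RETURN_KEY, value=result)
    return ti


def extract(**kwargs):
    # Explicit push under a custom key, plus an implicit push via return.
    kwargs["task_instance"].xcom_push(key="the_key", value="extra info")
    return "report.csv"  # hypothetical file name


def consume(**kwargs):
    ti = kwargs["task_instance"]
    returned = ti.xcom_pull(task_ids="my_task")               # default key
    keyed = ti.xcom_pull(task_ids="my_task", key="the_key")   # custom key
    return returned, keyed


run_task("my_task", extract)
pulled = consume(task_instance=FakeTaskInstance("consumer"))
print(pulled)
```

    In a real DAG the task instance comes from the context that Airflow passes to the callable, so only the bodies of extract and consume carry over.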
    

    EDIT 1

    Follow-up question: Instead of using the value in another function, how can I pass the value to another PythonOperator, like this: t2 = BashOperator(task_id='Moving_bucket', bash_command='python /home/raw.py "%s" '%file_name, dag=dag). I want to access file_name, which is returned by "Task1". How can this be achieved?

    First of all, it seems to me that the value is, in fact, not being passed to another PythonOperator but to a BashOperator.

    Secondly, this is already covered in my answer above. The field bash_command is templated (see template_fields in the source: https://github.com/apache/incubator-airflow/blob/master/airflow/operators/bash_operator.py). Hence, we can use the templated version:

    BashOperator(
      task_id='Moving_bucket',
      # Double quotes outside so the single quotes around 'Task1' do not end the string
      bash_command="python /home/raw.py {{ task_instance.xcom_pull(task_ids='Task1') }} ",
      dag=dag,
    )
    

    EDIT 2

    Explanation: Airflow executes Task1, stores its return value in XCom, and only then executes the next task. So for your example to work, Moving_bucket must be downstream of Task1 so that Task1 runs first.

    Since your function returns the value, you can also omit key='file' from xcom_pull and skip setting it manually in the function.
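
    The ordering point can be illustrated with a toy model that does not use Airflow at all. Everything here (the xcom dict, the run helper, the file name) is a hypothetical stand-in chosen for this sketch; it only shows why the downstream task sees the value when Task1 runs first and sees nothing when it does not.

```python
# Toy model of "run upstream, store its return value, then run downstream".
# This is not Airflow code; the dict below merely plays the role of XCom.
xcom = {}


def task1():
    return "file_2020.csv"  # hypothetical file name returned by Task1


def moving_bucket():
    # No key is needed: the value is looked up by the upstream task id alone.
    file_name = xcom.get("Task1")  # None if Task1 has not run yet
    return "python /home/raw.py {}".format(file_name)


def run(order):
    """Run the tasks in the given order, storing each return value."""
    xcom.clear()
    tasks = {"Task1": task1, "Moving_bucket": moving_bucket}
    results = {}
    for task_id in order:
        results[task_id] = tasks[task_id]()
        xcom[task_id] = results[task_id]
    return results["Moving_bucket"]


right = run(["Task1", "Moving_bucket"])   # upstream first: value is available
wrong = run(["Moving_bucket", "Task1"])   # downstream first: nothing to pull
print(right)
print(wrong)
```

    In an actual DAG you do not run tasks by hand; you declare the dependency, for example with task1 >> t2 or task1.set_downstream(t2), and the scheduler enforces the order.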
