Question
I am trying to pass the following configuration parameters to the Airflow CLI while triggering a DAG run. This is the trigger_dag command I am using:
airflow trigger_dag -c '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}' insights_assembly_9900
How can I access the conf parameters passed this way from inside an operator during the DAG run?
Answer 1:
This is probably a continuation of the answer provided by devj.
- In airflow.cfg, the following property should be set to true: dag_run_conf_overrides_params=True
- While defining the PythonOperator, pass the argument provide_context=True. For example:
get_row_count_operator = PythonOperator(task_id='get_row_count', python_callable=do_work, dag=dag, provide_context=True)
- Define the Python callable (note the use of **kwargs):
def do_work(**kwargs):
    table_name = kwargs['dag_run'].conf.get('table_name')
    # Rest of the code
- Invoke the DAG from the command line:
airflow trigger_dag read_hive --conf '{"table_name":"my_table_name"}'
I have found this discussion to be helpful.
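The callable above can be exercised without a running Airflow instance. The following is a minimal sketch under stated assumptions: the dag_run object is replaced by a SimpleNamespace stand-in, and the fallback value 'default_table' is hypothetical and not part of the original answer. It also guards against conf being empty or None, which can happen when the run was not triggered with --conf.

```python
from types import SimpleNamespace


def do_work(**kwargs):
    """Return the table name passed via --conf, or a fallback value."""
    dag_run = kwargs.get("dag_run")
    # conf may be absent or None if the run was not triggered with --conf.
    conf = (dag_run.conf if dag_run is not None else None) or {}
    # 'default_table' is a hypothetical fallback used only for illustration.
    return conf.get("table_name", "default_table")


# Stand-in for the DagRun object that Airflow would place in the context.
fake_run = SimpleNamespace(conf={"table_name": "my_table_name"})
print(do_work(dag_run=fake_run))
print(do_work(dag_run=SimpleNamespace(conf=None)))
```

Reading the value with .get() and a default keeps the task from failing with a KeyError when the DAG is started without any conf.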
Answer 2:
There are two ways to access the params passed in the airflow trigger_dag command:
- In the callable defined for a PythonOperator, access them as kwargs['dag_run'].conf.get('account_list')
- If the field you are using is a templated field, use {{ dag_run.conf['account_list'] }}
The schedule_interval for the externally triggerable DAG should be set to None for the above approaches to work.
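One detail in the question's command is worth a sketch: account_list is quoted inside the JSON, so it would arrive as the string "[1,2,3,4,5]" rather than as a list. The example below assumes the --conf string is parsed as JSON into dag_run.conf, and uses a SimpleNamespace stand-in for dag_run so it runs without Airflow. The function name read_accounts is hypothetical.

```python
import json
from types import SimpleNamespace


def read_accounts(**kwargs):
    """Return (account_list, start_date) from the triggering conf."""
    conf = kwargs["dag_run"].conf or {}
    raw = conf.get("account_list", "[]")
    # The value is a JSON-encoded string here, so decode it a second time;
    # if it was passed as a real JSON array, use it as-is.
    accounts = json.loads(raw) if isinstance(raw, str) else list(raw)
    return accounts, conf.get("start_date")


# Same payload as in the question's trigger_dag command.
cli_conf = json.loads('{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}')
fake_run = SimpleNamespace(conf=cli_conf)

accounts, start_date = read_accounts(dag_run=fake_run)
print(accounts, start_date)
```

Passing the list unquoted, as in '{"account_list":[1,2,3,4,5]}', would avoid the second decoding step.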
Answer 3:
First, import conf and set the value somewhere:
from airflow.configuration import conf
conf.set("core", "my_key", "my_val")
Then read the value back wherever it is needed:
conf.get("core", "my_key")
Source: https://stackoverflow.com/questions/43652192/accessing-configuration-parameters-passed-to-airflow-through-cli