How to parse a JSON string in an Airflow template

左心房为你撑大大i submitted on 2019-12-06 13:19:00

You can add a custom Jinja filter to your DAG with the user_defined_filters parameter to parse the JSON.

The Airflow documentation describes this parameter as: a dictionary of filters that will be exposed in your jinja templates. For example, passing dict(hello=lambda name: 'Hello %s' % name) to this argument allows you to {{ 'world' | hello }} in all jinja templates related to this DAG.

import json

dag = DAG(
    ...
    # Expose json.loads as the "fromjson" filter in this DAG's templates.
    user_defined_filters={'fromjson': json.loads},
)

t1 = SimpleHttpOperator(
    task_id='job',
    xcom_push=True,
    ...
)

t2 = HttpSensor(
    endpoint='job/{{ (ti.xcom_pull("job") | fromjson)["jobId"] }}',
    ...
)
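
To see why the filter is needed, here is a minimal sketch in plain Python (no Airflow involved) of what the template expression evaluates to. The response body and the jobId value are made-up sample data, and xcom_value merely stands in for whatever ti.xcom_pull("job") would return.

```python
import json

# Hypothetical response body, standing in for ti.xcom_pull("job").
xcom_value = '{"jobId": 42, "state": "active"}'

# Same behavior as the fromjson filter registered via user_defined_filters.
fromjson = json.loads

# The XCom value is a plain string, so indexing it by key fails.
try:
    xcom_value["jobId"]
    raw_lookup_worked = True
except TypeError:
    raw_lookup_worked = False

# Rough equivalent of: 'job/{{ (ti.xcom_pull("job") | fromjson)["jobId"] }}'
endpoint = "job/{}".format(fromjson(xcom_value)["jobId"])
print(endpoint)
```

The parentheses in the template matter for the same reason: the filter has to run on the pulled string before the ["jobId"] lookup is applied.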

However, it may be cleaner to write your own custom JsonHttpOperator plugin (or add a flag to SimpleHttpOperator) that parses the JSON before returning, so that you can reference {{ ti.xcom_pull("job")["jobId"] }} directly in the template.

import json

class JsonHttpOperator(SimpleHttpOperator):

    def execute(self, context):
        # Parse the response text so the value pushed to XCom is already decoded.
        text = super(JsonHttpOperator, self).execute(context)
        return json.loads(text)
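
The flag variant mentioned above could look roughly like the sketch below. It is only an illustration of the pattern: StubHttpOperator is a hypothetical stand-in for SimpleHttpOperator so the example runs without Airflow, and the parse_json flag is an assumed name, not an existing Airflow parameter.

```python
import json


class StubHttpOperator:
    """Hypothetical stand-in for SimpleHttpOperator; returns the response text."""

    def __init__(self, response_text):
        self.response_text = response_text

    def execute(self, context):
        return self.response_text


class FlaggedHttpOperator(StubHttpOperator):
    """Decodes the body as JSON only when parse_json (assumed flag) is set."""

    def __init__(self, response_text, parse_json=False):
        super().__init__(response_text)
        self.parse_json = parse_json

    def execute(self, context):
        text = super().execute(context)
        return json.loads(text) if self.parse_json else text


# Without the flag the raw string comes back; with it, a dict.
raw = FlaggedHttpOperator('{"jobId": 7}').execute(context={})
parsed = FlaggedHttpOperator('{"jobId": 7}', parse_json=True).execute(context={})
```

Defaulting the flag to False keeps existing tasks unchanged, which is the main argument for a flag over a separate subclass.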

Alternatively, you can expose the json module to the template through user_defined_macros, which makes json available for use inside the template. However, it is probably a better idea to create a plugin, as Daniel suggested.

dag = DAG(
    'dagname',
    default_args=default_args,
    schedule_interval="@once",
    user_defined_macros={
        'json': json
    }
)

Then call it in the template:

finish_job = HttpSensor(
    task_id="finish_job",
    endpoint="kue/job/{{ json.loads(ti.xcom_pull('job'))['jobId'] }}",
    response_check=lambda response: response.json()['state'] == "complete",
    poke_interval=5,
    dag=dag
)