Airflow

How to use airflow xcoms with MySqlOperator

我的梦境, submitted on 2019-12-05 05:55:23
def mysql_operator_test():
    DEFAULT_DATE = datetime(2017, 10, 9)
    t = MySqlOperator(
        task_id='basic_mysql',
        sql="SELECT count(*) FROM table1 WHERE id > 100;",
        mysql_conn_id='mysql_default',
        dag=dag)
    t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, ignore_ti_state=False)

run_this = PythonOperator(
    task_id='getRecoReq',
    python_callable=mysql_operator_test,
    # xcom_push=True,
    dag=dag)

task2 = PythonOperator(
    task_id='mysql_select',
    provide_context=True,
    python_callable=blah,
    templates_dict={'requests': "{{ ti.xcom_pull(task_ids='getReq') }}"},
    dag=dag)

run_this.set_downstream(task2)

I want
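A minimal sketch of the usual pattern, assuming the goal is to hand the MySQL count to the downstream task via XCom: run the query inside the callable with MySqlHook (rather than calling t.run() on a MySqlOperator), return the value so the PythonOperator pushes it to XCom automatically, and make xcom_pull's task_ids match the pushing task's task_id exactly ('getRecoReq', not 'getReq'). The DAG id, table name, and use_count stand-in are illustrative.

from datetime import datetime
from airflow.models import DAG
from airflow.hooks.mysql_hook import MySqlHook
from airflow.operators.python_operator import PythonOperator

dag = DAG(dag_id='xcom_mysql_demo', start_date=datetime(2017, 10, 9))  # illustrative

def get_count(**context):
    # PythonOperator pushes its return value to XCom automatically.
    hook = MySqlHook(mysql_conn_id='mysql_default')
    return hook.get_first('SELECT count(*) FROM table1 WHERE id > 100')[0]

def use_count(templates_dict=None, **context):
    print(templates_dict['requests'])  # stand-in for the asker's blah

run_this = PythonOperator(
    task_id='getRecoReq',
    python_callable=get_count,
    provide_context=True,
    dag=dag)

task2 = PythonOperator(
    task_id='mysql_select',
    provide_context=True,
    python_callable=use_count,
    # task_ids must match the pushing task's task_id for xcom_pull to find the value
    templates_dict={'requests': "{{ ti.xcom_pull(task_ids='getRecoReq') }}"},
    dag=dag)

run_this.set_downstream(task2)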

TemplateNotFound when using Airflow's PostgresOperator with Jinja templating and SQL

时光总嘲笑我的痴心妄想, submitted on 2019-12-05 05:49:48
When trying to use Airflow's templating capabilities (via Jinja2) with the PostgresOperator, I've been unable to get things to render. It's quite possible I'm doing something wrong, but I'm pretty lost as to what the issue might be. Here's an example to reproduce the TemplateNotFound error I've been getting:

airflow.cfg

airflow_home = /home/gregreda/airflow
dags_folder = /home/gregreda/airflow/dags

relevant DAG and variables

default_args = {
    'owner': 'gregreda',
    'start_date': datetime(2016, 6, 1),
    'schedule_interval': None,
    'depends_on_past': False,
    'retries': 3,
    'retry_delay': timedelta
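A likely cause, sketched here under assumptions: Jinja only looks for template files relative to the DAG file or the directories listed in the DAG's template_searchpath. The sketch below assumes the .sql files live under /home/gregreda/airflow/dags/sql; the DAG id and my_query.sql file name are hypothetical.

from datetime import datetime
from airflow.models import DAG
from airflow.operators.postgres_operator import PostgresOperator

dag = DAG(
    dag_id='postgres_template_example',     # illustrative name
    start_date=datetime(2016, 6, 1),
    schedule_interval=None,
    # directory where Jinja should look for .sql template files
    template_searchpath=['/home/gregreda/airflow/dags/sql'],
)

t = PostgresOperator(
    task_id='run_query',
    sql='my_query.sql',  # hypothetical file, resolved via template_searchpath
    dag=dag,
)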

How to make Airflow's schedule_interval work correctly

微笑、不失礼, submitted on 2019-12-05 05:37:29
I want to try using Airflow instead of cron, but schedule_interval doesn't work as I expected. I wrote the Python code below. In my understanding, Airflow should have run at "2016/03/30 8:15:00", but it didn't run at that time. When I changed it to "'schedule_interval': timedelta(minutes=5)", it worked correctly, I think. The "notice_slack.sh" script just calls the Slack API to post to my channels.

# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals
import os
from airflow.operators import BashOperator
from airflow.models import DAG
from datetime import datetime,
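A likely explanation, sketched below with assumed names and paths: Airflow only triggers a run after its schedule interval has closed, so with a daily 08:15 cron the run carrying execution_date 2016-03-29 08:15 actually fires at 2016-03-30 08:15, and a start_date of 2016-03-30 would push the first real run to 03-31.

from datetime import datetime
from airflow.models import DAG
from airflow.operators import BashOperator

# First run fires at 2016-03-30 08:15 (one full interval after
# start_date), stamped with execution_date 2016-03-29 08:15.
dag = DAG(
    dag_id='notice_slack',                   # illustrative name
    start_date=datetime(2016, 3, 29, 8, 15),
    schedule_interval='15 8 * * *',          # every day at 08:15
)

# The trailing space keeps Airflow from treating the .sh path as a
# Jinja template file to load.
t = BashOperator(
    task_id='notify',
    bash_command='/path/to/notice_slack.sh ',  # hypothetical path
    dag=dag,
)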

Airflow keeps showing example DAGs even after removing them from the configuration

♀尐吖头ヾ, submitted on 2019-12-05 05:02:44
Airflow's example DAGs remain in the UI even after I have set load_examples = False in the config file. The system reports that the DAGs are not present in the dag folder, but they remain in the UI because the scheduler has marked them as active in the metadata database. I know one way to remove them would be to delete the corresponding rows in the database directly, but of course this is not ideal. How should I proceed to remove these DAGs from the UI?

There is currently no way of stopping a deleted DAG from being displayed on the UI except manually deleting the corresponding rows in the DB. The only other
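A minimal sketch of that manual cleanup, assuming direct access to the metadata database, a MySQL backend, and the Airflow 1.x table names (xcom, task_instance, sla_miss, log, dag_run, dag); the connection string and dag_id are hypothetical. Recent 1.10 releases also ship an "airflow delete_dag <dag_id>" CLI command that does this for you.

from sqlalchemy import create_engine

engine = create_engine('mysql://airflow:airflow@localhost/airflow')  # hypothetical DSN

dag_id = 'example_bash_operator'
with engine.begin() as conn:
    # Delete child rows first, the dag row itself last.
    for table in ('xcom', 'task_instance', 'sla_miss', 'log', 'dag_run', 'dag'):
        conn.execute('DELETE FROM ' + table + ' WHERE dag_id = %s', (dag_id,))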

How to get the JobID for the airflow dag runs?

梦想的初衷, submitted on 2019-12-05 04:57:18
When we do a dag run, in the Airflow UI's "Graph View" we get details of each job run. The JobID is something like "scheduled__2017-04-11T10:47:00". I need this JobID for tracking and log creation, in which I record the time each task/dagrun took. So my question is: how can I get the JobID within the same DAG that is being run? Thanks, Chetan

This value is actually called run_id and can be accessed via the context or macros. In the Python operator this is accessed via context, and in the Bash operator this is accessed via Jinja templating on the bash_command field. More info on what's available
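A minimal sketch of both access paths the answer describes; the DAG and task names are illustrative.

from datetime import datetime
from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.bash_operator import BashOperator

dag = DAG(dag_id='run_id_demo', start_date=datetime(2017, 4, 11))  # illustrative

def print_run_id(**context):
    # e.g. "scheduled__2017-04-11T10:47:00"
    print(context['dag_run'].run_id)

py = PythonOperator(
    task_id='show_run_id_python',
    provide_context=True,
    python_callable=print_run_id,
    dag=dag,
)

sh = BashOperator(
    task_id='show_run_id_bash',
    bash_command='echo "{{ run_id }}"',
    dag=dag,
)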

Airflow: a deleted DAG still shows on the page

一世执手, submitted on 2019-12-05 03:56:10
When you delete a DAG file but the page still displays the DAG, you need to restart the Airflow webserver:

ps -ef | egrep 'scheduler|airflow-webserver' | grep -v grep | awk '{print $2}' | xargs kill -9
rm -rf /home/airflow/airflow/airflow-scheduler.pid
airflow webserver -p 8080 -D    # start the webserver in the background
airflow scheduler -D            # start the scheduler in the background
tail -f /home/airflow/airflow/airflow-scheduler.err

Source: https://www.cnblogs.com/braveym/p/11904085.html

How to wait for an asynchronous event in a task of a DAG in a workflow implemented using Airflow?

岁酱吖の, submitted on 2019-12-05 02:47:11
My workflow implemented using Airflow contains tasks A, B, C, and D. I want the workflow to wait at task C for an event. In Airflow, sensors are used to check for some condition by polling for some state; if that condition is true, the next task in the workflow gets triggered. My requirement is to avoid polling. One answer here mentions a rest_api_plugin for Airflow, which creates REST API endpoints to trigger the Airflow CLI; using this plugin I can trigger a task in the workflow. In my workflow, however, I want to implement a task that waits for a REST API call (an async event) without
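One polling-free pattern, sketched here under assumptions: split the workflow into two DAGs, let the first end at task C, and have the external system fire the second DAG (D onward) when the event arrives, via the rest_api_plugin above or "airflow trigger_dag post_event_dag". All names below are illustrative.

from datetime import datetime
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator

# schedule_interval=None means this DAG never runs on its own;
# it only runs when something external triggers it.
dag = DAG(
    dag_id='post_event_dag',        # illustrative name
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
)

task_d = BashOperator(
    task_id='task_d',
    bash_command='echo "event received, resuming workflow"',
    dag=dag,
)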

External files in Airflow DAG

不羁岁月, submitted on 2019-12-05 02:09:36
I'm trying to access external files in an Airflow task to read some SQL, and I'm getting "file not found". Has anyone come across this?

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta

dag = DAG(
    'my_dat',
    start_date=datetime(2017, 1, 1),
    catchup=False,
    schedule_interval=timedelta(days=1)
)

def run_query():
    # read the query
    query = open('sql/queryfile.sql')
    # run the query
    execute(query)

tas = PythonOperator(
    task_id='run_query', dag=dag, python_callable=run_query)

The log states the following: IOError: [Errno 2] No
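A minimal sketch of the usual fix, assuming sql/queryfile.sql sits in a folder next to the DAG file: the bare relative path fails because the worker's working directory is not the dags folder, so build an absolute path from the DAG file's own location.

import os

def run_query():
    # Resolve the file relative to this DAG file rather than the
    # worker's current working directory.
    base = os.path.dirname(os.path.abspath(__file__))
    with open(os.path.join(base, 'sql', 'queryfile.sql')) as f:
        query = f.read()
    print(query)  # stand-in for the asker's execute(query)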

Launch a subdag with variable parallel tasks in airflow

ぃ、小莉子, submitted on 2019-12-05 02:09:07
Problem: I have an Airflow workflow that I'd like to modify (see illustration at the bottom). However, I couldn't find a way to do that in the docs. I've looked at subdags, branching, and XComs without luck. There doesn't seem to be a way to specify how many tasks should run in parallel in a subdag based on the return value of an operator. To add to the problem, each task in the subdag receives a different parameter (an element from the list returned by the previous operator). This is an illustration of what
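There is no first-class way to size a subdag from an operator's return value, so a common workaround, sketched here under assumptions, is to have the upstream task write its list to an Airflow Variable (here "work_items", illustrative) that the subdag factory reads at parse time; the factory would then be wrapped in a SubDagOperator.

from datetime import datetime
from airflow.models import DAG, Variable
from airflow.operators.python_operator import PythonOperator

def build_subdag(parent_dag_id, child_dag_id, default_args):
    subdag = DAG(
        dag_id='%s.%s' % (parent_dag_id, child_dag_id),
        default_args=default_args,
        start_date=datetime(2019, 1, 1),
    )
    # One parallel task per element; each task receives its own element.
    items = Variable.get('work_items', deserialize_json=True, default_var=[])
    for item in items:
        PythonOperator(
            task_id='process_%s' % item,
            python_callable=lambda current=item: print(current),
            dag=subdag,
        )
    return subdag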

Airflow installation successful, but unable to run it

て烟熏妆下的殇ゞ, submitted on 2019-12-05 01:52:57
C:\Python27\Scripts>airflow initdb
'airflow' is not recognized as an internal or external command, operable program or batch file.

C:\Python27\Scripts>airflow init
'airflow' is not recognized as an internal or external command, operable program or batch file.

C:\Python27\Scripts>airflow webserver -p 8080
'airflow' is not recognized as an internal or external command, operable program or batch file.

I am trying to install it on a Windows 7 machine, using Python 2.7.

Jeremy Farrell: Airflow doesn't officially support running on Windows. From my limited testing with it I was unable to get it