airflow

Accessing Kubernetes Secret from Airflow KubernetesPodOperator

Question: I'm setting up an Airflow environment on Google Cloud Composer for testing. I've added some secrets to my namespace, and they show up fine:

    $ kubectl describe secrets/eric-env-vars
    Name:         eric-env-vars
    Namespace:    eric-dev
    Labels:       <none>
    Annotations:  <none>
    Type:         Opaque
    Data
    ====
    VERSION_NUMBER:  6 bytes

I've referenced this secret in my DAG definition file (leaving out some code for brevity):

    env_var_secret = Secret(
        deploy_type='env',
        deploy_target='VERSION_NUMBER',
        secret='eric-env-vars',
        key
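
For reference, the usual pattern is to build the Secret object and hand it to the operator's secrets list. This is only a sketch: the task id, image, command, and the key value are assumptions, not the asker's code, and it assumes Airflow 1.10-era contrib imports as used on Cloud Composer at the time.

    # Sketch only: hypothetical task wiring around the secret from the question.
    from airflow.contrib.kubernetes.secret import Secret
    from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

    # Expose one key of the Kubernetes secret as an environment variable in the pod.
    env_var_secret = Secret(
        deploy_type='env',
        deploy_target='VERSION_NUMBER',   # name of the env var inside the pod
        secret='eric-env-vars',           # Kubernetes secret name from the question
        key='VERSION_NUMBER',             # key within that secret (assumed)
    )

    use_secret = KubernetesPodOperator(
        task_id='use-secret',             # hypothetical task id
        name='use-secret',
        namespace='eric-dev',
        image='ubuntu:18.04',             # placeholder image
        cmds=['bash', '-c', 'echo $VERSION_NUMBER'],
        secrets=[env_var_secret],         # the operator injects the secret here
        dag=dag,                          # assumes a DAG object defined elsewhere
    )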

How to pause a task in airflow

Question: There is a DAG which contains 4 tasks, as shown below. Sometimes I want to run task3 only after checking a couple of things. How can I pause task3 of the next instance? Is there any way to pause a future (tomorrow's) task instance? I know we can use the command-line interface "airflow pause dagid", but here I want to pause a task id, not the whole DAG.

    task1 >> task2 >> task3 >> task4

Source: https://stackoverflow.com/questions
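
There is no per-task pause in the CLI, so one common workaround (a sketch only; the Variable name and gate task are made up, not the asker's code) is to guard task3 with a ShortCircuitOperator that checks a flag you can flip in the Airflow UI:

    # Sketch: gate task3 on an Airflow Variable that can be changed in the UI.
    from airflow.models import Variable
    from airflow.operators.python_operator import ShortCircuitOperator  # Airflow 1.10-style import

    def task3_enabled():
        # Return False to skip everything downstream of the gate for this run.
        return Variable.get('run_task3', default_var='true') == 'true'   # hypothetical Variable name

    gate_task3 = ShortCircuitOperator(
        task_id='gate_task3',
        python_callable=task3_enabled,
        dag=dag,  # assumes the same DAG object as task1..task4
    )

    task1 >> task2 >> gate_task3 >> task3 >> task4

Note that a short circuit skips everything downstream of the gate, so task4 would be skipped as well; if task4 must still run, a BranchPythonOperator with an explicit join is closer to what is wanted.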

How to read dynamic arguments in an Airflow operator?

Question: I am new to Python and Airflow DAGs. I am following the link and code mentioned in the answer section of: How to pass dynamic arguments Airflow operator? I am having trouble reading the YAML file; the YAML file holds some configuration-related arguments:

    configs:
      cluster_name: "test-cluster"
      project_id: "t***********"
      zone: "europe-west1-c"
      num_workers: 2
      worker_machine_type: "n1-standard-1"
      master_machine_type: "n1-standard-1"

In the DAG script I have created one task which will create a cluster,
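
A minimal sketch of how such a YAML file is typically consumed, assuming the config above is for a Dataproc cluster: the file path, task id, and choice of operator are assumptions, not the asker's exact setup.

    # Sketch: load the YAML shown above and feed it to a cluster-creation task.
    import yaml

    from airflow.contrib.operators.dataproc_operator import DataprocClusterCreateOperator  # Airflow 1.10-style import

    with open('/path/to/config.yaml') as f:           # hypothetical path to the YAML file
        config = yaml.safe_load(f)['configs']

    create_cluster = DataprocClusterCreateOperator(
        task_id='create_cluster',
        cluster_name=config['cluster_name'],
        project_id=config['project_id'],
        zone=config['zone'],
        num_workers=config['num_workers'],
        worker_machine_type=config['worker_machine_type'],
        master_machine_type=config['master_machine_type'],
        dag=dag,                                      # assumes a DAG object defined elsewhere
    )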

catchup = False: why are two scheduled runs still created?

Question: I have a simple DAG (Airflow v1.10.16, using SequentialExecutor on a localhost machine) with start_date set in the past and catchup = False:

    default_args = {
        'owner': 'test_user',
        'start_date': datetime(2019, 12, 1, 1, 00, 00),
    }

    graph1 = DAG(
        dag_id='test_dag',
        default_args=default_args,
        schedule_interval=timedelta(days=1),
        catchup=False,
    )

    t = PythonOperator(task_id='t', python_callable=my_func, dag=graph1)

As per the code comment (":param catchup: Perform scheduler catchup (or only run latest)?") I expected when the
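
Without the full answer, one thing worth spelling out is the scheduling arithmetic: a run for interval X only fires once X + schedule_interval has passed, and catchup=False suppresses the backlog but still creates the run for the most recent completed interval; the run for the interval that is currently open follows as soon as that interval closes. A rough illustration of that arithmetic (dates made up, not Airflow code):

    # Sketch of the scheduling arithmetic behind the "extra" run.
    from datetime import datetime, timedelta

    start_date = datetime(2019, 12, 1, 1, 0, 0)
    interval = timedelta(days=1)
    now = datetime(2019, 12, 20, 12, 0, 0)       # hypothetical "current" time

    # Find the interval that is still open at `now`.
    open_interval = start_date
    while open_interval + interval <= now:
        open_interval += interval

    # With catchup=False the scheduler still creates the run for the most
    # recent completed interval immediately...
    print(open_interval - interval)   # execution_date of the run created right away

    # ...and a second run appears once the currently open interval closes.
    print(open_interval)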

How to implement polling in Airflow?

Question: I want to use Airflow to implement data flows that periodically poll external systems (FTP servers, etc.), check for new files matching certain conditions, and then run a bunch of tasks for those files. Now, I'm a newbie to Airflow and read that Sensors are what you would use for this kind of case, and I actually managed to write a sensor that works OK when I run "airflow test" for it. But I'm a bit confused about the relation between the sensor's poke_interval and the DAG scheduling.
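
Roughly speaking, poke_interval is how often one sensor task instance re-checks within a single DAG run, while schedule_interval is how often a new DAG run (and with it a new sensor instance) is started. A minimal custom-sensor sketch, with a local-directory check standing in for the FTP logic (the class, path, and file pattern are invented for illustration):

    # Sketch of a polling sensor (Airflow 1.10-style imports); the file check is a
    # local-directory placeholder, not a real FTP client.
    import glob

    from airflow.sensors.base_sensor_operator import BaseSensorOperator
    from airflow.utils.decorators import apply_defaults

    class NewFileSensor(BaseSensorOperator):
        @apply_defaults
        def __init__(self, watch_dir, *args, **kwargs):
            super(NewFileSensor, self).__init__(*args, **kwargs)
            self.watch_dir = watch_dir

        def poke(self, context):
            # Re-runs every `poke_interval` seconds within one DAG run until it
            # returns True or the sensor times out.
            return bool(glob.glob(self.watch_dir + '/*.csv'))

    wait_for_files = NewFileSensor(
        task_id='wait_for_files',
        watch_dir='/incoming',   # hypothetical path
        poke_interval=60,        # seconds between pokes inside one DAG run
        timeout=60 * 60,         # stop poking after an hour
        dag=dag,                 # assumes a DAG object defined elsewhere
    )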

Airflow PythonOperator with a return type

Question: I have a PythonOperator in my DAG. The Python callable function returns a bool value, but when I run the DAG I get the error below:

    TypeError: 'bool' object is not callable

I modified the function to return nothing, but then I keep getting this error instead:

    ERROR - 'NoneType' object is not callable

Below is my DAG:

    def check_poke(threshold,sleep_interval):
        flag=snowflake_poke(1000,10).poke()
        #print(flag)
        return flag

    dependency = PythonOperator(
        task_id='poke_check',
        #python_callable
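
Both errors usually mean the function was called while defining the operator, so its return value (a bool, later None) ended up where Airflow expects a callable. A hedged sketch of the conventional shape, reusing the question's names but with the function body and argument values replaced by placeholders:

    # Sketch: hand PythonOperator the function itself; arguments go in op_kwargs.
    from airflow.operators.python_operator import PythonOperator  # Airflow 1.10-style import

    def check_poke(threshold, sleep_interval):
        # Body elided; in the question this wraps snowflake_poke(...).poke().
        return True

    dependency = PythonOperator(
        task_id='poke_check',
        python_callable=check_poke,   # no parentheses: pass the callable, not its result
        op_kwargs={'threshold': 1000, 'sleep_interval': 10},   # example values (assumed)
        dag=dag,                      # assumes a DAG object defined elsewhere
    )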

Airflow - Proper way to handle DAG callbacks

Question: I have a DAG, and whenever it succeeds or fails I want it to trigger a method that posts to Slack. My DAG args are like below:

    default_args = {
        [...]
        'on_failure_callback': slack.slack_message(sad_message),
        'on_success_callback': slack.slack_message(happy_message),
        [...]
    }

And the DAG definition itself:

    dag = DAG(
        dag_id = dag_name_id,
        default_args=default_args,
        description='load data from mysql to S3',
        schedule_interval='*/10 * * * *',
        catchup=False
    )

But when I check Slack there is more
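
For what it's worth, the usual cause of a flood of messages here is that slack.slack_message(...) is executed every time the scheduler parses the DAG file, because the call's result, not a callable, is stored in default_args. A sketch of the conventional shape, reusing the question's names (slack, sad_message, happy_message are assumed to exist in the asker's module):

    # Sketch: pass callables, not call results, so the message is only sent on the event.
    def notify_failure(context):
        # `context` is supplied by Airflow and includes the task instance, execution date, etc.
        slack.slack_message(sad_message)

    def notify_success(context):
        slack.slack_message(happy_message)

    default_args = {
        # ...
        'on_failure_callback': notify_failure,   # no parentheses
        'on_success_callback': notify_success,
        # ...
    }

Also note that callbacks placed in default_args are applied to every task, so they fire per task rather than once per DAG run, which is a separate, common surprise.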

Airflow Packaged Dag (Zip) not recognized

Question: I am trying to package my repository with my DAG in a zip file, as described here in the documentation. I have followed the convention in the documentation, which is to keep the DAG in the root of the zip, with the sub-directories treated as packages by Airflow. My zip file has the following contents:

    $ unzip -l $AIRFLOW_HOME/dags/test_with_zip.zip
    Archive:  /home/arjunc/Tutorials/airflow/dags/test_with_zip.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
            0  2018-03-29 17:46
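
As a point of reference, the expected shape of a packaged DAG zip is a .py module at the zip root that defines a module-level DAG object, with any helper packages alongside it. The sketch below uses invented file names and a dummy task, not the asker's repository layout:

    # Sketch of a DAG module placed at the root of the zip (Airflow 1.10-style imports).
    # Expected layout inside test_with_zip.zip (names hypothetical):
    #   my_dag.py            <- at the zip root, defines a module-level DAG object
    #   deploy/__init__.py   <- sub-packages importable from my_dag.py
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator

    # Note: in safe mode the DagBag only inspects files whose text mentions both
    # "airflow" and "DAG", and the DAG object must live at module level.
    dag = DAG(
        dag_id='test_with_zip',
        start_date=datetime(2018, 3, 1),
        schedule_interval=None,
    )

    start = DummyOperator(task_id='start', dag=dag)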

airflow initdb failed: ImportError: No module named log.logging_mixin

Question: I had Airflow 1.7.0 installed on a machine where I don't have root access. Everything is installed in /apps/dist/, which I own. I ran:

    $ pip install apache-airflow

I had a lot of success, until this:

    Installing collected packages: webencodings, html5lib, bleach, configparser, flask-wtf, future, gunicorn, apache-airflow
      Found existing installation: Flask-WTF 0.12
        Uninstalling Flask-WTF-0.12:
          Successfully uninstalled Flask-WTF-0.12
      Found existing installation: future 0.15.2
        Uninstalling
