apache-airflow

Airflow: How to push an XCom value from PostgresOperator?

不羁岁月 submitted on 2019-12-01 05:32:34

Question: I'm using Airflow 1.8.1 and I want to push the result of a SQL query from PostgresOperator. Here are my tasks:

```
check_task = PostgresOperator(
    task_id='check_task',
    postgres_conn_id='conx',
    sql="check_task.sql",
    xcom_push=True,
    dag=dag)

def py_is_first_execution(**kwargs):
    value = kwargs['ti'].xcom_pull(task_ids='check_task')
    print 'count ----> ', value
    if value == 0:
        return 'next_task'
    else:
        return 'end-flow'

check_branch = BranchPythonOperator(
    task_id='is-first-execution',
    python_callable=py_is_first_execution,
    ...
```
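
For context, a common workaround from that era (an assumption of mine, not part of the question): PostgresOperator in Airflow 1.8.x does not return the query result from execute(), so xcom_push=True pushes nothing useful. Running the query through PostgresHook inside a PythonOperator gets the value into XCom, because a PythonOperator's return value is pushed automatically. A minimal sketch, reusing the question's 'conx' connection id; the SQL and table name are hypothetical placeholders:

```
from airflow.hooks.postgres_hook import PostgresHook
from airflow.operators.python_operator import PythonOperator

def run_check(**kwargs):
    hook = PostgresHook(postgres_conn_id='conx')
    row = hook.get_first("SELECT COUNT(*) FROM some_table")  # hypothetical SQL
    return row[0]  # the return value is pushed to XCom automatically

check_task = PythonOperator(
    task_id='check_task',
    python_callable=run_check,
    provide_context=True,
    dag=dag)  # dag defined as in the question
```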

How do I set up Airflow's email configuration to send an email on errors?

你说的曾经没有我的故事 submitted on 2019-12-01 05:10:56

I'm trying to make an Airflow task intentionally fail and error out by passing in a Bash line (thisshouldnotrun) that doesn't work. Airflow outputs the following:

```
[2017-06-15 17:44:17,869] {bash_operator.py:94} INFO - /tmp/airflowtmpLFTMX7/run_bashm2MEsS: line 7: thisshouldnotrun: command not found
[2017-06-15 17:44:17,869] {bash_operator.py:97} INFO - Command exited with return code 127
[2017-06-15 17:44:17,869] {models.py:1417} ERROR - Bash command failed
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
...
```
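
For context (a sketch of mine, not from the question): failure emails are normally enabled by filling in the [smtp] section of airflow.cfg (smtp_host, smtp_user, smtp_password, smtp_mail_from) and turning on email_on_failure for the tasks, usually through default_args. The DAG id and recipient below are hypothetical placeholders:

```
from datetime import datetime
from airflow import DAG

default_args = {
    'email': ['alerts@example.com'],  # hypothetical recipient
    'email_on_failure': True,         # mail is sent when a task fails
    'email_on_retry': False,
    'start_date': datetime(2017, 6, 1),
}

dag = DAG('email_on_error_example', default_args=default_args,
          schedule_interval=None)
```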

Jobs not executing via Airflow that runs Celery with RabbitMQ

筅森魡賤 submitted on 2019-11-30 23:26:21

Below is the config I'm using:

```
[core]
# The home folder for airflow, default is ~/airflow
airflow_home = /root/airflow
# The folder where your airflow pipelines live, most likely a
# subfolder in a code repository
dags_folder = /root/airflow/dags
# The folder where airflow should store its log files. This location
base_log_folder = /root/airflow/logs
# An S3 location can be provided for log backups
# For S3, use the full URL to the base folder (starting with "s3://...")
s3_log_folder = None
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor,
...
```
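
The excerpt is cut off before the executor and broker settings, which are usually the first things to verify when Celery workers pick up no jobs. A small diagnostic sketch (my assumption about what to check, not from the question), using the 1.8-era [celery] key names:

```
from airflow import configuration as conf

# Print the settings that most often explain idle Celery workers.
for section, key in [('core', 'executor'),
                     ('celery', 'broker_url'),
                     ('celery', 'celery_result_backend')]:
    print(section, key, '=', conf.get(section, key))
```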

Airflow failed Slack message

|▌冷眼眸甩不掉的悲伤 submitted on 2019-11-30 16:26:40

How can I configure Airflow so that any failure in the DAG will (immediately) result in a Slack message? At the moment I manage it by creating a slack_failed_task:

```
slack_failed_task = SlackAPIPostOperator(
    task_id='slack_failed',
    channel="#datalabs",
    trigger_rule='one_failed',
    token="...",
    text=':red_circle: DAG Failed',
    icon_url='http://airbnb.io/img/projects/airflow3.png',
    dag=dag)
```

and setting this task (with trigger_rule one_failed) downstream of every other task in the DAG:

```
slack_failed_task << download_task_a
slack_failed_task << download_task_b
slack_failed_task << process_task_c
slack_failed_task <<
...
```
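
The usual alternative (a sketch of mine, not from the question) is an on_failure_callback attached through default_args, so every task reports its own failure without extra edges in the DAG. The callback name and message format are hypothetical; the channel and placeholder token are taken from the question:

```
from airflow.operators.slack_operator import SlackAPIPostOperator

def task_fail_slack_alert(context):
    # context carries details about the failed task instance
    alert = SlackAPIPostOperator(
        task_id='slack_failed_alert',
        channel="#datalabs",
        token="...",  # placeholder, as in the question
        text=':red_circle: Task failed: {}'.format(
            context['task_instance'].task_id))
    return alert.execute(context=context)

default_args = {
    'on_failure_callback': task_fail_slack_alert,
}
```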

Airflow: Log file isn't local, Unsupported remote log location

强颜欢笑 submitted on 2019-11-30 12:13:37

I am not able to see the logs attached to the tasks in the Airflow UI. The log-related settings in my airflow.cfg file are:

```
remote_base_log_folder =
base_log_folder = /home/my_projects/ksaprice_project/airflow/logs
worker_log_server_port = 8793
child_process_log_directory = /home/my_projects/ksaprice_project/airflow/logs/scheduler
```

Although I am setting remote_base_log_folder, it is trying to fetch the log from http://:8793/log/tutorial/print_date/2017-08-02T00:00:00, and I don't understand this behavior. According to the settings the workers should store the logs at /home/my_projects/ksaprice_project...
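
One observation (my inference from how the 1.8-era webserver serves task logs, not from the question): when a log file is not found locally, the webserver fetches it from the worker over HTTP, composing the URL from the task instance's recorded hostname and worker_log_server_port, so an empty recorded hostname yields exactly a URL of the form http://:8793/log/.... A simplified sketch of that composition (an approximation, not Airflow's exact code):

```
def worker_log_url(hostname, port, log_relative_path):
    # the webserver falls back to the worker's log server when the
    # file is not on local disk
    return "http://{}:{}/log/{}".format(hostname, port, log_relative_path)

print(worker_log_url("", 8793, "tutorial/print_date/2017-08-02T00:00:00"))
# -> http://:8793/log/tutorial/print_date/2017-08-02T00:00:00
```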

Airflow unpause dag programmatically?

我是研究僧i submitted on 2019-11-30 03:00:56

Question: I have a dag that we'll deploy to multiple different airflow instances, and in our airflow.cfg we have dags_are_paused_at_creation = True, but for this specific dag we want it to be turned on without having to do so manually by clicking in the UI. Is there a way to do it programmatically?

Answer 1: The airflow-rest-api-plugin can also be used to programmatically pause a DAG:

```
Pauses a DAG
Available in Airflow Version: 1.7.0 or greater
GET - http://{HOST}:{PORT}/admin/rest_api/api?api=pause
Query ...
```
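
Another route (a sketch under my own assumptions, not from the answer above): Airflow 1.x also ships an `airflow unpause <dag_id>` CLI command, and the same flag can be flipped directly on the DagModel metadata table. The dag id below is a hypothetical placeholder:

```
from airflow import settings
from airflow.models import DagModel

def unpause_dag(dag_id):
    session = settings.Session()
    dag = session.query(DagModel).filter(DagModel.dag_id == dag_id).first()
    if dag:
        dag.is_paused = False  # same effect as the UI on/off toggle
        session.commit()
    session.close()

unpause_dag('my_dag_id')
```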

How to restart a failed task on Airflow

流过昼夜 submitted on 2019-11-30 00:03:44

I am using a LocalExecutor and my dag has 3 tasks, where task C is dependent on task A. Task B and task A can run in parallel, something like below:

```
A --> C
B
```

So task A has failed, but task B ran fine. Task C is yet to run, as task A has failed. My question is: how do I re-run task A alone, so that task C runs once task A completes and the Airflow UI marks them as success?

In the UI:

1. Go to the dag, and the dag run of the run you want to change
2. Click on GraphView
3. Click on task A
4. Click "Clear"

This will let task A run again, and if it succeeds, task C should run. This works because when you clear a ...
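
The same "Clear" action is available from the 1.x command line; below is a sketch of mine (the dag id, dates, and task regex are hypothetical placeholders), clearing task A together with its downstream tasks so the scheduler re-runs A and then C:

```
import subprocess

subprocess.check_call([
    'airflow', 'clear', 'my_dag',  # hypothetical dag id
    '-t', '^task_a$',              # regex matching only task A
    '-d',                          # also clear downstream tasks (task C)
    '-s', '2017-06-01',            # hypothetical start date
    '-e', '2017-06-01',            # hypothetical end date
    '--no_confirm',                # skip the interactive prompt
])
```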

apache-airflow 1.9: default timezone set to non-UTC

笑着哭i submitted on 2019-11-29 12:49:48

Question: I recently upgraded from Airflow 1.8 to apache-airflow 1.9. The upgrade was successful, and I have scaled the environment using the Celery executor. Everything seemed to be working fine, but the dag and task start dates, execution dates, etc. all appear in the UTC timezone, and the scheduled dags run in UTC; before the upgrade they used to run in the local timezone, which is PDT. Any ideas on how to make PDT the default timezone in Airflow? I have tried using ...
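
For context: Airflow 1.9 schedules and displays everything in UTC with no supported override; configurable timezones only arrived in Airflow 1.10 through the default_timezone option in [core]. A common 1.9-era workaround (a sketch of mine, not from the question) is to convert inside the task; 'America/Los_Angeles' is the zone that observes PDT:

```
import pytz

def to_local(execution_date):
    # Airflow 1.9 passes execution_date as a naive UTC datetime
    utc_dt = pytz.utc.localize(execution_date)
    return utc_dt.astimezone(pytz.timezone('America/Los_Angeles'))
```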