airflow

Implementing PostgreSQL in Apache Airflow

心已入冬 submitted on 2020-06-12 08:48:23
Question: I have Apache Airflow implemented on an Ubuntu 18.04.3 server. When I set it up, I used the generic SQLite database, and this uses the SequentialExecutor. I did this just to play around and get used to the system. Now I'm trying to use the LocalExecutor, and will need to transition my database from SQLite to the recommended PostgreSQL. Does anybody know how to make this transition? All of the tutorials I've found entail setting up Airflow with PostgreSQL from the beginning. I
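A minimal sketch of the usual migration path, assuming a local PostgreSQL install; the database name, user, password, and host below are placeholders. Note that this initialises a fresh metadata database, so the run history stored in the old SQLite file is not carried over.

# create a database and user for Airflow (placeholder names and password)
sudo -u postgres psql -c "CREATE USER airflow WITH PASSWORD 'airflow';"
sudo -u postgres psql -c "CREATE DATABASE airflow OWNER airflow;"

# install the Postgres extra so the psycopg2 driver is available
pip install 'apache-airflow[postgres]'

# in airflow.cfg, repoint the metadata DB and switch executors:
#   sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
#   executor = LocalExecutor

# re-initialise the metadata database, then restart the webserver and scheduler
airflow initdb        # "airflow db init" on Airflow 2.x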

Airflow sql_path not able to read the sql files when passed as Jinja Template Variable

只谈情不闲聊 submitted on 2020-06-01 07:40:27
Question: I am trying to use a Jinja template variable instead of Variable.get('sql_path'), so as to avoid hitting the DB for every scan of the DAG file. Original code:

import datetime
import os
from functools import partial
from datetime import timedelta

from airflow.models import DAG, Variable
from airflow.contrib.operators.snowflake_operator import SnowflakeOperator
from alerts.email_operator import dag_failure_email

SNOWFLAKE_CONN_ID = 'etl_conn'
tmpl_search_path = []
for subdir in ['business/',
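A minimal sketch of the Jinja-variable approach, assuming an Airflow Variable named sql_path exists; the dag_id, schedule, and query are placeholders. {{ var.value.sql_path }} is resolved only when a templated field such as sql is rendered at task run time, so the metadata DB is not hit on every parse of the DAG file. By contrast, template_searchpath is an ordinary DAG argument read at parse time, so a Jinja expression placed there is never rendered, which is a likely reason files are not found when the path itself is passed as a template variable.

from datetime import datetime

from airflow.models import DAG
from airflow.contrib.operators.snowflake_operator import SnowflakeOperator

# Placeholder DAG; 'etl_conn' follows the connection id used in the question.
with DAG(dag_id='example_jinja_variable',
         start_date=datetime(2020, 1, 1),
         schedule_interval='@daily') as dag:

    run_query = SnowflakeOperator(
        task_id='run_query',
        snowflake_conn_id='etl_conn',
        # 'sql' is a templated field: the Variable is looked up at run time,
        # not while the scheduler parses the file.
        sql="SELECT '{{ var.value.sql_path }}' AS sql_path",  # trivial query, just to show the substitution
    )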

Why is there an automatic DAG 'airflow_monitoring' generated in GCP Composer?

妖精的绣舞 submitted on 2020-05-29 10:35:47
Question: When creating an Airflow environment on GCP Composer, there is a DAG named "airflow_monitoring" automatically created, and it is impossible to delete. Why? How should I handle it? Should I copy this file inside my DAG folder and resign myself to making it part of my code? I noticed that each time I upload my code it stops the execution of this DAG, as it cannot be found inside the DAG folder until it magically reappears. I have already tried deleting it inside the DAG folder, delete the logs,

Apache Airflow: operator to copy s3 to s3

淺唱寂寞╮ submitted on 2020-05-28 04:40:27
Question: What is the best operator to copy a file from one S3 bucket to another in Airflow? I already tried S3FileTransformOperator, but it requires either a transform_script or a select_expression. My requirement is to copy the exact file from source to destination. Answer 1: You have 2 options (even when I disregard Airflow). Use the AWS CLI cp command: aws s3 cp <source> <destination>; in Airflow this command can be run using BashOperator (local machine) or SSHOperator (remote machine). Use the AWS SDK, aka boto3. Here you
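A minimal sketch of the BashOperator route mentioned above, assuming the AWS CLI is installed and credentialed on the worker; the bucket names, object keys, and dag_id are placeholders.

from datetime import datetime

from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(dag_id='s3_to_s3_copy',
         start_date=datetime(2020, 1, 1),
         schedule_interval=None) as dag:

    copy_file = BashOperator(
        task_id='copy_file',
        # plain object copy, no transform script or select expression required
        bash_command='aws s3 cp s3://source-bucket/path/file.csv s3://dest-bucket/path/file.csv',
    )

If adding the AWS CLI to the worker image is not an option, boto3's copy_object call inside a PythonOperator achieves the same plain copy.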

How do I check when my next Airflow DAG run has been scheduled for a specific dag?

拟墨画扇 submitted on 2020-05-27 02:58:18
Question: I have Airflow set up and running with some DAGs scheduled for once a day ("0 0 * * *"). I want to check when a specific DAG is next scheduled to run, but I can't see where I can do that within the admin UI. Answer 1: If you want to use Airflow's CLI, there's a next_execution option to get the next execution datetime of a DAG: airflow next_execution [-h] [-sd SUBDIR] dag_id UPDATE-1: If you need to do it programmatically (within an Airflow task), you can refer to next_execution(..)
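A rough programmatic sketch, assuming Airflow 1.10 and a placeholder dag_id; it uses the DAG object's following_schedule() method to do roughly what the next_execution CLI helper does, from inside Python.

from airflow.models import DagBag

# load the DAG definition from the dags folder (placeholder dag_id)
dag = DagBag().get_dag('my_daily_dag')

# take the latest execution_date recorded for the DAG and ask the schedule
# for the one that follows it; only meaningful once the DAG has run before
latest = dag.latest_execution_date
if latest is not None:
    print(dag.following_schedule(latest))
else:
    print("No execution record yet for this DAG")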

Airflow won't write logs to s3

爷,独闯天下 submitted on 2020-05-26 04:11:07
Question: I tried different ways to configure Airflow 1.9 to write logs to S3, but it just ignores them. I found a lot of people having problems reading the logs after doing so; however, my problem is that the logs remain local. I can read them without problem, but they are not in the specified S3 bucket. What I tried first was to write into the airflow.cfg file: # Airflow can store logs remotely in AWS S3 or Google Cloud Storage. Users # must supply an Airflow connection id that provides access to the
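For reference, a sketch of the airflow.cfg settings that S3 remote logging hinges on in Airflow 1.9; the bucket, prefix, and connection id are placeholders. In 1.9 these alone are usually not sufficient: a custom logging config module (a copy of airflow/config_templates/airflow_local_settings.py, commonly named log_config.py and placed on the PYTHONPATH) has to define the s3.task handler, and logs are only uploaded to S3 when a task finishes, so they always appear locally first.

[core]
# placeholder bucket/prefix; must be reachable with the connection below
remote_base_log_folder = s3://my-airflow-logs/logs
remote_log_conn_id = MyS3Conn
encrypt_s3_logs = False

# point Airflow at the custom logging config that wires up the S3 handler
logging_config_class = log_config.LOGGING_CONFIG
task_log_reader = s3.task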
