apache-airflow

How to add template variable in the filename of an EmailOperator task? (Airflow)

◇◆丶佛笑我妖孽 Submitted on 2021-02-07 06:28:05
Question: I can't seem to get this to work. I am trying to send a given file daily, whose name is like 'file_{{ ds_nodash }}.csv'. The problem is that the template variable is not rendered when I use it in the attachment filename; in the body or the subject of the email it works perfectly, just not in the filename. Here is the DAG as an example: local_file = 'file-{{ ds_nodash }}.csv' send_stats_csv = EmailOperator( task_id='send-stats-csv', to=['email@gmail.com'], subject='Subject - {{ ds }}', html_content='Here …
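
A likely explanation (not stated in the thread itself): in older Airflow releases the files argument of EmailOperator is not listed in template_fields, so Jinja variables such as {{ ds_nodash }} are rendered in subject and html_content but left untouched in the attachment path. A minimal sketch, assuming Airflow 2.x import paths and hypothetical DAG and file names, that adds files to the templated fields:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.email import EmailOperator  # airflow.operators.email_operator in 1.x


class TemplatedFilesEmailOperator(EmailOperator):
    # Render the `files` list with Jinja in addition to the default templated fields.
    template_fields = tuple(set(EmailOperator.template_fields) | {"files"})


with DAG(
    dag_id="daily_stats_email",          # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
) as dag:
    send_stats_csv = TemplatedFilesEmailOperator(
        task_id="send-stats-csv",
        to=["email@gmail.com"],
        subject="Subject - {{ ds }}",
        html_content="Daily stats for {{ ds }} attached.",
        files=["file-{{ ds_nodash }}.csv"],   # now rendered to e.g. file-20210207.csv
    )
```

Recent Airflow versions already include files in EmailOperator.template_fields, in which case the plain EmailOperator works as-is and the subclass is a no-op.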

Guarantee that some operators will be executed on the same airflow worker

痞子三分冷 Submitted on 2021-01-28 07:33:15
Question: I have a DAG which (1) downloads a CSV file from cloud storage and (2) uploads that CSV file to a third party via HTTPS. The Airflow cluster I am executing on uses CeleryExecutor by default, so I'm worried that once I scale up the number of workers, these tasks may be executed on different workers, e.g. worker A does the download and worker B tries the upload but doesn't find the file (because it is on worker A). Is it possible to somehow guarantee that both the download and upload operators will be …
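
One common pattern, sketched below under the assumption of CeleryExecutor and hypothetical download/upload callables, is to pin both tasks to a dedicated Celery queue that only a single worker consumes (the simpler alternative is to merge download and upload into one task so they always share a filesystem):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator  # python_operator in 1.x


def download_csv(**context):
    # Placeholder for the real "download from cloud storage" logic.
    ...


def upload_csv(**context):
    # Placeholder for the real "upload via HTTPS" logic.
    ...


with DAG(
    dag_id="download_then_upload",        # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
) as dag:
    download = PythonOperator(
        task_id="download_csv",
        python_callable=download_csv,
        queue="single_host",              # assumed queue name, not an Airflow default
    )
    upload = PythonOperator(
        task_id="upload_csv",
        python_callable=upload_csv,
        queue="single_host",
    )
    download >> upload

# Start exactly one worker listening on that queue, e.g.
#   airflow celery worker -q single_host    (Airflow 2.x CLI)
```

As long as a single worker subscribes to that queue, both tasks land on the same machine and the downloaded file is visible to the upload step.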

Airflow CROSSSLOT Keys in request don't hash to the same slot error using AWS ElastiCache

穿精又带淫゛_ Submitted on 2021-01-27 19:56:07
Question: I am running apache-airflow 1.8.1 on AWS ECS and I have an AWS ElastiCache cluster (Redis 3.2.4) running 2 shards / 2 nodes with Multi-AZ enabled (clustered Redis engine). I've verified that Airflow can reach the host/port of the cluster without any problem. Here are the logs: Thu Jul 20 01:39:21 UTC 2017 - Checking for redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379) connectivity Thu Jul 20 01:39:21 UTC 2017 - Connected to redis (endpoint: redis://xxxxxx.xxxxxx …
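
For context (not part of the original logs): Celery's Redis transport issues multi-key commands whose keys must all hash to the same slot, which a sharded, cluster-mode-enabled ElastiCache endpoint cannot guarantee, hence the CROSSSLOT error. A minimal sketch, assuming the standard AIRFLOW__SECTION__KEY environment override and a placeholder hostname, that points the Celery broker at a cluster-mode-disabled (single primary) endpoint instead:

```python
# Placeholder endpoint: a cluster-mode-disabled (non-sharded) ElastiCache
# Redis instance, which Celery's Redis transport can use as a broker.
import os

os.environ["AIRFLOW__CELERY__BROKER_URL"] = (
    "redis://my-broker.xxxxxx.0001.usw2.cache.amazonaws.com:6379/0"
)
# Equivalent setting in airflow.cfg: broker_url under the [celery] section.
```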

(Django) ORM in airflow - is it possible?

时光怂恿深爱的人放手 Submitted on 2020-05-10 07:42:23
Question: How can I work with Django models inside Airflow tasks? According to the official Airflow documentation, Airflow provides hooks for interacting with databases (such as MySqlHook / PostgresHook / etc.) that can later be used in operators for raw query execution. Attaching the core code fragments, copied from https://airflow.apache.org/_modules/mysql_hook.html: class MySqlHook(DbApiHook): conn_name_attr = 'mysql_conn_id' default_conn_name = 'mysql_default' supports_autocommit = True def get_conn(self): """ …
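
Hooks aside, one possible way to reach Django models from a task, sketched below with a hypothetical project "myproject" and app "myapp", is to bootstrap Django inside the task callable and import the models only after django.setup() has run:

```python
import os
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator  # python_operator in 1.x


def create_report(**context):
    # Bootstrap Django before touching any model; both names are assumptions.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
    import django
    django.setup()
    from myapp.models import Report  # hypothetical model; import only after setup()
    Report.objects.create(run_date=context["ds"])


with DAG(
    dag_id="django_orm_example",          # hypothetical DAG id
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
) as dag:
    # Airflow 1.x additionally needs provide_context=True for the **context kwargs.
    PythonOperator(task_id="create_report", python_callable=create_report)
```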

Run airflow process and airflow webserver as airflow user

*爱你&永不变心* Submitted on 2020-01-30 09:00:08
Question: Problem: I am setting up a Google Compute Engine VM on GCP with Airflow installed on it. I am now trying to integrate Airflow with systemd by following the instructions at http://airflow.readthedocs.io/en/latest/configuration.html#integration-with-systemd; however, they assume that Airflow will run under airflow:airflow. How can I set up the Airflow installation so that whenever any user on that VM runs airflow from the shell, it runs as the airflow user on the backend? It is similar to hive …
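
The systemd units themselves can be given User=airflow / Group=airflow, which covers the scheduler and webserver services; for ad-hoc shell usage, one possible approach (not from the thread, and similar to what the asker describes for hive) is a small wrapper script placed on PATH ahead of the real entry point, backed by a matching sudoers rule. A sketch, assuming the real binary lives at /home/airflow/.local/bin/airflow:

```python
#!/usr/bin/env python
# Hypothetical wrapper saved as e.g. /usr/local/bin/airflow: re-executes every
# invocation as the airflow user via sudo. The install path below is an
# assumption, and a sudoers entry allowing this command is still required.
import os
import sys

REAL_AIRFLOW = "/home/airflow/.local/bin/airflow"  # assumed install location

os.execvp("sudo", ["sudo", "-u", "airflow", REAL_AIRFLOW] + sys.argv[1:])
```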
