airflow

Airflow Scheduler out of memory problems

我只是一个虾纸丫 posted on 2020-12-04 03:44:21
Question: We are experimenting with Apache Airflow (version 1.10rc2, with Python 2.7) and deploying it to Kubernetes, with the webserver and scheduler in separate pods and the database hosted on Cloud SQL, but we have been facing out-of-memory problems with the scheduler pod. At the moment of the OOM, we were running only 4 example DAGs (approximately 20 tasks). The memory for the pod is 1 GiB. I've seen in other posts that a task might consume approximately 50 MiB of memory when running, and all task
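The figures in the question can be put side by side with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes that all 20 tasks run at the same time and inside the scheduler pod, neither of which the question states, and the function name is made up for illustration.

```python
# Figures quoted in the question; the concurrency assumption is ours.
POD_LIMIT_MIB = 1024      # 1 GiB pod memory limit
TASK_COUNT = 20           # approximately 20 tasks across the 4 example DAGs
PER_TASK_MIB = 50         # rough per-task figure the asker saw in other posts


def worst_case_task_memory_mib(task_count, per_task_mib):
    """Memory needed if every task runs at the same time (hypothetical worst case)."""
    return task_count * per_task_mib


needed = worst_case_task_memory_mib(TASK_COUNT, PER_TASK_MIB)
headroom = POD_LIMIT_MIB - needed
print("tasks: %d MiB, headroom left for the scheduler itself: %d MiB" % (needed, headroom))
```

Under those assumptions the tasks alone would come close to the pod limit, leaving very little for the scheduler process itself.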

how to clear failing DAGs using the CLI in airflow

故事扮演 posted on 2020-12-03 06:21:11
Question: I have some failing DAGs, let's say from 1st-Feb to 20th-Feb. From that date upward, all of them succeeded. I tried to use the CLI (instead of doing it twenty times with the Web UI): airflow clear -f -t * my_dags.my_dag_id But I got a weird error: airflow: error: unrecognized arguments: airflow-webserver.pid airflow.cfg airflow_variables.json my_dags.my_dag_id EDIT 1: As @tobi6 explained, the * was indeed causing trouble. Knowing that, I tried this command instead: airflow clear -u -d
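The file names in that error message suggest that the shell replaced the unquoted * with the contents of the working directory before airflow ever saw the arguments. The sketch below imitates that expansion in Python. It is an approximation of shell behaviour, not what any particular shell does internally; the helper name expand_like_shell is hypothetical, and the three file names are copied from the error message on the assumption that they were the files in the asker's directory.

```python
import glob
import os
import tempfile


def expand_like_shell(argv, cwd):
    """Approximate how a POSIX shell expands unquoted glob patterns in argv."""
    expanded = []
    for arg in argv:
        if any(ch in arg for ch in "*?["):
            pattern = os.path.join(glob.escape(cwd), arg)
            matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
            # With default settings, a pattern that matches nothing is passed through unchanged.
            expanded.extend(matches if matches else [arg])
        else:
            expanded.append(arg)
    return expanded


# Recreate a directory holding the files named in the error message.
workdir = tempfile.mkdtemp()
for name in ("airflow-webserver.pid", "airflow.cfg", "airflow_variables.json"):
    open(os.path.join(workdir, name), "w").close()

typed = ["airflow", "clear", "-f", "-t", "*", "my_dags.my_dag_id"]
received = expand_like_shell(typed, workdir)
print(received)
```

Quoting the pattern on the command line (for example '*' in single quotes) stops the shell from expanding it. Whether the -t option then expects a regular expression rather than a glob is worth checking against the output of airflow clear --help.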

How do we trigger multiple airflow dags using TriggerDagRunOperator?

感情迁移 posted on 2020-12-03 05:32:13
Question: I have a scenario in which a particular DAG, upon completion, needs to trigger multiple DAGs. I have used TriggerDagRunOperator to trigger a single DAG; is it possible to pass multiple DAGs to the TriggerDagRunOperator to trigger multiple DAGs? And is it possible to trigger only upon successful completion of the current DAG? Answer 1: I have faced the same problem. There is no out-of-the-box solution, but we can write a custom operator for it. So here is the code of a custom operator, that get python
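The custom operator from the answer is cut off in this listing, so the sketch below shows a different and simpler approach: creating one TriggerDagRunOperator per target DAG in a loop. The helper build_trigger_tasks and the names dag, last_task, dag_b and dag_c are hypothetical. The operator class is passed in as an argument so the helper can be tried without Airflow installed; the import path in the comment is the Airflow 1.10 one, and older 1.x releases may also require a python_callable argument.

```python
def build_trigger_tasks(operator_cls, target_dag_ids, **common_kwargs):
    """Create one trigger task per target DAG id.

    operator_cls is expected to accept task_id and trigger_dag_id keyword
    arguments, as TriggerDagRunOperator does.
    """
    return [
        operator_cls(
            task_id="trigger_%s" % dag_id,
            trigger_dag_id=dag_id,
            **common_kwargs
        )
        for dag_id in target_dag_ids
    ]


# Hypothetical wiring inside a DAG file (dag, last_task, dag_b, dag_c are made up):
#
#   from airflow.operators.dagrun_operator import TriggerDagRunOperator
#   triggers = build_trigger_tasks(TriggerDagRunOperator, ["dag_b", "dag_c"], dag=dag)
#   for trigger in triggers:
#       last_task.set_downstream(trigger)
```

Because a task's default trigger rule is all_success, trigger tasks placed downstream of the final task only run when that task has succeeded, which addresses the second part of the question.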

apache airflow: initdb vs resetdb

与世无争的帅哥 posted on 2020-11-28 07:22:16
Question: What precisely is the difference between the "airflow initdb" command and the "airflow resetdb" command? Is it really necessary to have two different commands? When is it appropriate to use one vs. the other? The doc says ... airflow initdb: Initialize the metadata database; airflow resetdb: Burn down and rebuild the metadata database. This doesn't tell me much. My best guess is that airflow initdb is to be used only the first time that the database is created from the airflow.cfg airflow
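The two doc descriptions can be pictured with a toy model. The sketch below uses an in-memory SQLite database and a single made-up table, dag_run; it is not Airflow's implementation. It also assumes that initialising is safe to repeat, which the quoted doc line does not state.

```python
import sqlite3

# Hypothetical one-table schema standing in for the metadata database.
SCHEMA = "CREATE TABLE IF NOT EXISTS dag_run (id INTEGER PRIMARY KEY, dag_id TEXT)"


def init_db(conn):
    """Create missing tables; existing tables and their rows are left alone."""
    conn.execute(SCHEMA)
    conn.commit()


def reset_db(conn):
    """Drop everything, then build the empty schema again."""
    conn.execute("DROP TABLE IF EXISTS dag_run")
    conn.commit()
    init_db(conn)


def count_rows(conn):
    """Return the number of rows currently stored in dag_run."""
    return conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]


conn = sqlite3.connect(":memory:")
init_db(conn)
conn.execute("INSERT INTO dag_run (dag_id) VALUES ('example')")
conn.commit()

init_db(conn)                          # initialising again keeps the existing row
rows_after_second_init = count_rows(conn)

reset_db(conn)                         # resetting discards the row
rows_after_reset = count_rows(conn)
print(rows_after_second_init, rows_after_reset)
```

In this model the practical difference is data loss: the reset path is the one that destroys existing history.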
