airflow

airflow cleared tasks not getting executed

我是研究僧i submitted on 2019-12-07 04:46:26
Question: Preamble: Yet another "airflow tasks not getting executed" question... Everything was going more or less fine in my Airflow experience up until this weekend, when things really went downhill. I have checked all the standard things, e.g. as outlined in this helpful post. I have reset the whole instance multiple times trying to get it working properly, but I am totally losing the battle here.
Environment:
version: airflow 1.10.2
os: centos 7
python: python 3.6
virtualenv: yes
executor: LocalExecutor
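A hedged diagnostic sketch (not part of the original post): with a LocalExecutor setup like the one above, one way to see what happened to the cleared tasks is to query their states straight from the metadata database; the dag_id below is a placeholder.

from airflow import settings
from airflow.models import TaskInstance

session = settings.Session()
rows = (
    session.query(TaskInstance)
    .filter(TaskInstance.dag_id == "my_dag")   # hypothetical dag_id
    .order_by(TaskInstance.execution_date.desc())
    .limit(20)
)
for ti in rows:
    # Cleared task instances sit in state None until the scheduler re-queues them;
    # if they never leave None, the scheduler is not picking them up.
    print(ti.task_id, ti.execution_date, ti.state)
session.close()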

Pass parameters to Airflow Experimental REST api when creating dag run

北城以北 submitted on 2019-12-07 04:39:25
Question: It looks like Airflow has an experimental REST API that allows users to create DAG runs with an HTTP POST request. This is awesome. Is there a way to pass parameters via HTTP when creating the DAG run? Judging from the official docs, found here, it would seem the answer is "no", but I'm hoping I'm wrong.
Answer 1: I had the same issue. The "conf" value must be a string:
curl -X POST \
  http://localhost:8080/api/experimental/dags/<DAG_ID>/dag_runs \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json
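The same call sketched in Python (an assumption-laden example, not quoted from the answer): requests stands in for curl, <DAG_ID> stays a placeholder, and "conf" is serialized to a JSON string to match the curl answer above.

import json
import requests

payload = {"conf": json.dumps({"key": "value"})}  # conf must be a string
resp = requests.post(
    "http://localhost:8080/api/experimental/dags/<DAG_ID>/dag_runs",
    json=payload,
    headers={"Cache-Control": "no-cache"},
)
print(resp.status_code, resp.text)

Inside the triggered DAG, the parameters are then available as dag_run.conf (for example {{ dag_run.conf["key"] }} in templated fields).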

How to get the JobID for the airflow dag runs?

坚强是说给别人听的谎言 submitted on 2019-12-07 02:07:23
Question: When we trigger a DAG run, the Airflow UI's "Graph View" shows details of each job run. The JobID is something like "scheduled__2017-04-11T10:47:00". I need this JobID for tracking and log creation, in which I record how long each task/DAG run took. So my question is: how can I get the JobID within the same DAG that is being run? Thanks, Chetan
Answer 1: This value is actually called run_id and can be accessed via the context or macros. In the python operator this is accessed via context, and in the
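A hedged sketch of that answer (the surrounding dag object and task ids are assumptions): provide_context=True exposes run_id to a Python callable, and Jinja exposes it as {{ run_id }} in templated fields.

from airflow.operators.python_operator import PythonOperator
from airflow.operators.bash_operator import BashOperator

def log_run_id(**context):
    # For scheduled runs this looks like "scheduled__2017-04-11T10:47:00"
    print("run_id:", context["run_id"])

show_run_id = PythonOperator(
    task_id="show_run_id",
    python_callable=log_run_id,
    provide_context=True,
    dag=dag,  # assumes a `dag` object defined elsewhere
)

echo_run_id = BashOperator(
    task_id="echo_run_id",
    bash_command="echo {{ run_id }}",  # the macro form of the same value
    dag=dag,
)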

How to use airflow xcoms with MySqlOperator

安稳与你 submitted on 2019-12-07 01:50:32
Question:
def mysql_operator_test():
    DEFAULT_DATE = datetime(2017, 10, 9)
    t = MySqlOperator(
        task_id='basic_mysql',
        sql="SELECT count(*) from table 1 where id>100;",
        mysql_conn_id='mysql_default',
        dag=dag)
    t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, ignore_ti_state=False)

run_this = PythonOperator(
    task_id='getRecoReq',
    python_callable=mysql_operator_test,
    # xcom_push=True,
    dag=dag)

task2 = PythonOperator(
    task_id='mysql_select',
    provide_context=True,
    python_callable=blah,
    templates_dict =
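A hedged alternative sketch (not the poster's code): MySqlOperator does not push query results to XCom, so a common workaround is to run the query through MySqlHook inside a PythonOperator, whose return value is pushed to XCom automatically; the table name and filter below are placeholders.

from airflow.hooks.mysql_hook import MySqlHook
from airflow.operators.python_operator import PythonOperator

def count_rows(**context):
    hook = MySqlHook(mysql_conn_id="mysql_default")
    # get_first returns the first result row; its first column is the count
    return hook.get_first("SELECT count(*) FROM table_1 WHERE id > 100")[0]

def use_count(**context):
    count = context["ti"].xcom_pull(task_ids="count_rows")
    print("row count:", count)

count_task = PythonOperator(task_id="count_rows", python_callable=count_rows,
                            provide_context=True, dag=dag)
use_task = PythonOperator(task_id="use_count", python_callable=use_count,
                          provide_context=True, dag=dag)
count_task >> use_task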

Airflow will keep showing example dags even after removing it from configuration

对着背影说爱祢 submitted on 2019-12-07 01:34:13
Question: The Airflow example DAGs remain in the UI even after I have set load_examples = False in the config file. The system reports that the DAGs are not present in the dag folder, but they remain in the UI because the scheduler has marked them as active in the metadata database. I know one way to remove them would be to delete these rows directly in the database, but of course this is not ideal. How should I proceed to remove these DAGs from the UI?
Answer 1: There is currently no way of stopping a deleted DAG
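If deleting the rows directly is acceptable after all, here is a hedged sketch (assuming access to the metadata database through the same airflow.cfg the scheduler uses); note that not every bundled example DAG id starts with "example_", so the filter is only illustrative.

from airflow import settings
from airflow.models import DagModel

session = settings.Session()
deleted = (
    session.query(DagModel)
    .filter(DagModel.dag_id.like("example_%"))  # illustrative filter only
    .delete(synchronize_session="fetch")
)
session.commit()
session.close()
print("removed %d DAG entries" % deleted)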

TemplateNotFound when using Airflow's PostgresOperator with Jinja templating and SQL

左心房为你撑大大i submitted on 2019-12-07 00:07:57
Question: When trying to use Airflow's templating capabilities (via Jinja2) with the PostgresOperator, I've been unable to get things to render. It's quite possible I'm doing something wrong, but I'm pretty lost as to what the issue might be. Here's an example to reproduce the TemplateNotFound error I've been getting:
airflow.cfg
airflow_home = /home/gregreda/airflow
dags_folder = /home/gregreda/airflow/dags
relevant DAG and variables
default_args = {
    'owner': 'gregreda',
    'start_date': datetime(2016, 6
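A hedged sketch of the usual fix (not quoted from an answer): point the DAG's template_searchpath at the directory holding the .sql files so the operator's Jinja loader can find them; the dag_id and sql directory below are assumptions.

from datetime import datetime
from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator

dag = DAG(
    "postgres_template_example",                          # hypothetical dag_id
    start_date=datetime(2016, 6, 1),
    schedule_interval=None,
    template_searchpath=["/home/gregreda/airflow/sql"],   # directory with the .sql files
)

run_query = PostgresOperator(
    task_id="run_query",
    postgres_conn_id="postgres_default",
    sql="queryfile.sql",   # resolved against template_searchpath
    dag=dag,
)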

How to wait for an asynchronous event in a task of a DAG in a workflow implemented using Airflow?

徘徊边缘 submitted on 2019-12-06 21:22:42
Question: My workflow, implemented using Airflow, contains tasks A, B, C, and D. I want the workflow to wait at task C for an event. In Airflow, sensors are used to check for some condition by polling for some state; if that condition is true, the next task in the workflow gets triggered. My requirement is to avoid polling. Here, one answer mentions a rest_api_plugin for Airflow which creates a REST API endpoint to trigger the Airflow CLI; using this plugin I can trigger a task in the workflow. In my
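One hedged way to avoid polling (an assumption, not taken from the question's answers) is to split the workflow: A and B run in a scheduled DAG, while C and D live in a second DAG with no schedule that the external system triggers through the experimental REST endpoint when the event fires.

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# Second half of the workflow: never scheduled, only externally triggered.
dag = DAG(
    "workflow_part_two",        # hypothetical dag_id
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
)

task_c = DummyOperator(task_id="task_c", dag=dag)
task_d = DummyOperator(task_id="task_d", dag=dag)
task_c >> task_d

# The external event source would then call, for example:
#   curl -X POST http://<host>:8080/api/experimental/dags/workflow_part_two/dag_runs \
#        -H 'Content-Type: application/json' -d '{}'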

External files in Airflow DAG

て烟熏妆下的殇ゞ submitted on 2019-12-06 19:37:05
Question: I'm trying to access external files in an Airflow task to read some SQL, and I'm getting "file not found". Has anyone come across this?

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta

dag = DAG(
    'my_dat',
    start_date=datetime(2017, 1, 1),
    catchup=False,
    schedule_interval=timedelta(days=1)
)

def run_query():
    # read the query
    query = open('sql/queryfile.sql')
    # run the query
    execute(query)

tas = PythonOperator(
    task_id=
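A hedged sketch of the usual fix (execute() is the poster's own helper, kept as-is): the worker's working directory is generally not the DAG folder, so build the path to the .sql file relative to the DAG file itself.

import os

DAG_DIR = os.path.dirname(os.path.abspath(__file__))

def run_query():
    # read the query from a path anchored at the DAG file, not the CWD
    with open(os.path.join(DAG_DIR, 'sql', 'queryfile.sql')) as f:
        query = f.read()
    # run the query
    execute(query)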

Managing Airflow with supervisor

馋奶兔 submitted on 2019-12-06 16:27:30
# Use the airflow account
su - airflow
. /home/airflow/venv/bin/activate
pip install supervisor
mkdir -p /home/airflow/venv/etc
# Copy G:\文档\大数据\airflow\ali-supervisord.conf to /home/airflow/venv/etc
sudo chown airflow.airflow supervisord.conf
supervisord -c /home/airflow/venv/etc/supervisord.conf

Source: https://www.cnblogs.com/hongfeng2019/p/11994435.html

Can you get a static external IP address for Google Cloud Composer / Airflow?

不想你离开。 submitted on 2019-12-06 15:31:39
I know how to assign a static external IP address to a Compute Engine instance, but can this be done with Google Cloud Composer (Airflow)? I'd imagine most companies need that functionality, since they'd generally be writing back to a warehouse that is possibly behind a firewall, but I can't find any docs on how to do this. It's not possible to assign a static IP to the underlying GKE cluster in a Composer environment. The endpoint @kaxil mentioned is the Kubernetes master endpoint, not the GKE nodes. If the intent is to let all outgoing network connections from Composer tasks use the same external