airflow

Airflow 1.9.0 ExternalTaskSensor retry_delay=30 yields TypeError: can't pickle _thread.RLock objects

Submitted by 故事扮演 on 2021-02-10 18:13:06
Question: As the title says: in Airflow 1.9.0, if you use the retry_delay=30 parameter (or any other number) with the ExternalTaskSensor, the DAG runs just fine until you want to clear the task instances in the Airflow GUI -> it then returns the following error: "TypeError: can't pickle _thread.RLock objects" (and a nice Oops message). But if you use retry_delay=timedelta(seconds=30), clearing task instances works fine. If I look through the models.py method, the deepcopy should go fine, so it seems
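
For context, a minimal sketch of the working form described above is below; the DAG and task ids are hypothetical, and the import path assumes the Airflow 1.9-era module layout. The point is simply that retry_delay is passed as a datetime.timedelta rather than a bare int.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.sensors import ExternalTaskSensor  # 1.9-era import path

# Hypothetical DAG and external task ids, for illustration only.
dag = DAG(
    dag_id='example_child_dag',
    start_date=datetime(2021, 1, 1),
    schedule_interval='@daily',
)

wait_for_parent = ExternalTaskSensor(
    task_id='wait_for_parent',
    external_dag_id='example_parent_dag',
    external_task_id='parent_task',
    retries=2,
    retry_delay=timedelta(seconds=30),  # timedelta here; a bare int (30) is what breaks clearing
    dag=dag,
)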

Not able to setup airflow, getting error while “Initiating Airflow Database”

Submitted by 三世轮回 on 2021-02-10 06:53:56
Question: Not able to set up Airflow; I get an error while "Initiating Airflow Database". The error is: File "/Library/Frameworks/Python.framework/Versions/3.8/bin/airflow", line 26, in <module> from airflow.bin.cli import CLIFactory File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/airflow/bin/cli.py", line 79, in <module> api_module = import_module(conf.get('cli', 'api_client')) # type: Any File "/Library/Frameworks/Python.framework/Versions/3.8/lib
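
If it helps to narrow this down, a small diagnostic sketch is below; it is only an assumption on my part that a version mismatch is involved (the traceback points at the system Python 3.8 install, which older Airflow releases may not support), so this just surfaces the two versions to check against the release notes of the installed version.

import sys

import airflow

# Print the interpreter and Airflow versions so they can be compared
# against the supported-Python matrix for the installed release.
print('Python :', sys.version.split()[0])
print('Airflow:', airflow.__version__)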

airflow pool used slots is greater than slots limit

Submitted by ≯℡__Kan透↙ on 2021-02-10 05:34:49
Question: There are three sensor tasks that use the same pool. The pool 'limit_sensor' is set to 1 slot, but the pool limit does not work: all three sensors run together. sensor_wait = SqlSensor( task_id='sensor_wait', dag=dag, conn_id='dest_data', sql="select count(*) from test", poke_interval=10, timeout=60, pool='limit_sensor', priority_weight=100 ) same_pool1 = SqlSensor( task_id='same_pool1', dag=dag, conn_id='dest_data', sql="select count(*) from test", poke_interval=10, timeout=60, pool='limit_sensor', priority
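
For reference, a self-contained sketch of the setup described above is below; the DAG id is hypothetical, the import path assumes Airflow 1.10, and the 'limit_sensor' pool is assumed to already exist with a single slot (e.g. created under Admin -> Pools).

from datetime import datetime

from airflow import DAG
from airflow.sensors.sql_sensor import SqlSensor  # 1.10-era import path

dag = DAG(
    dag_id='pool_limited_sensors',   # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
)

# Three sensors all pointed at the single-slot pool; with one slot they
# are expected to run one at a time rather than all together.
sensors = [
    SqlSensor(
        task_id='sensor_%d' % i,
        conn_id='dest_data',
        sql='select count(*) from test',
        poke_interval=10,
        timeout=60,
        pool='limit_sensor',
        dag=dag,
    )
    for i in range(3)
]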

Run airflow DAG for each file

Submitted by 烈酒焚心 on 2021-02-10 04:14:08
Question: So I have this quite nice DAG in Airflow which basically runs several analysis steps (implemented as Airflow plugins) on binary files. The DAG is triggered by an FTP sensor which just checks whether there is a new file on the FTP server and then starts the whole workflow. So currently the workflow is like this: DAG is triggered as defined -> sensor waits for a new file on FTP -> analysis steps are executed -> end of workflow. What I'd like to have is something like this: DAG is triggered -> sensor waits
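
One common pattern for this (a sketch under my own assumptions, not the asker's actual setup) is to split the work in two: a controller DAG decides that a file needs analysing and fires one run of a separate analysis DAG per file, passing the filename through the trigger payload. The DAG ids, the filename, and the schedule below are all hypothetical, and the TriggerDagRunOperator usage follows the Airflow 1.10 API.

from datetime import datetime

from airflow import DAG
from airflow.operators.dagrun_operator import TriggerDagRunOperator

controller = DAG(
    dag_id='ftp_controller',          # hypothetical controller DAG
    start_date=datetime(2021, 1, 1),
    schedule_interval='@hourly',
)

def _pass_filename(context, dag_run_obj):
    # The payload becomes dag_run.conf in the triggered run; the analysis
    # DAG can read it there, e.g. context['dag_run'].conf['filename'].
    dag_run_obj.payload = {'filename': context['params']['filename']}
    return dag_run_obj

trigger_analysis = TriggerDagRunOperator(
    task_id='trigger_analysis',
    trigger_dag_id='analysis_dag',        # hypothetical id of the DAG holding the analysis steps
    python_callable=_pass_filename,
    params={'filename': 'new_file.bin'},  # placeholder; in practice supplied by the FTP check
    dag=controller,
)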

how to pass query parameter to sql file using bigquery operator

Submitted by 房东的猫 on 2021-02-09 11:11:26
Question: I need to access the parameter passed by BigQueryOperator in the SQL file, but I am getting the error: ERROR - queryParameters argument must have a type <class 'dict'> not <class 'list'>. I am using the code below: t2 = bigquery_operator.BigQueryOperator( task_id='bq_from_source_to_clean', sql='prepare.sql', use_legacy_sql=False, allow_large_results=True, query_params=[{ 'name': 'threshold_date', 'parameterType': { 'type': 'STRING' }, 'parameterValue': { 'value': '2020-01-01' } }], destination_dataset_table="{
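
One workaround to consider (a sketch under my own assumptions, not necessarily the fix for this exact Airflow version) is to sidestep BigQuery query parameters and let Airflow's Jinja templating substitute the value instead, since sql is a templated field of BigQueryOperator and anything passed via params is available in the template. The DAG id and destination table below are hypothetical; the original destination_dataset_table value is truncated above.

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators import bigquery_operator

dag = DAG(
    dag_id='bq_prepare',                 # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
)

# prepare.sql would then reference the value through Jinja instead of a
# named query parameter, e.g.:
#   SELECT * FROM `my_dataset.my_table`
#   WHERE event_date >= '{{ params.threshold_date }}'
t2 = bigquery_operator.BigQueryOperator(
    task_id='bq_from_source_to_clean',
    sql='prepare.sql',                   # templated field, so the Jinja expression is rendered
    use_legacy_sql=False,
    allow_large_results=True,
    params={'threshold_date': '2020-01-01'},  # exposed to the template as params.threshold_date
    destination_dataset_table='my_project.my_dataset.clean_table',  # hypothetical
    dag=dag,
)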
