airflow

airflow: error: unrecognized arguments: webserver

Submitted by 浪尽此生 on 2019-12-11 05:49:59
Question: I am trying to start my Airflow webserver, but it reports the argument as unrecognized:

$ airflow webserver
[2017-05-25 15:06:44,682] {__init__.py:36} INFO - Using executor CeleryExecutor
[Airflow ASCII-art startup banner]
[2017-05-25 15:06:45,099] {models.py:154} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
usage: (output truncated by the source)
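No answer survives in the source for this entry. Since the CLI itself is printing the usage message, one sanity check is to confirm which airflow entry point and package are actually being invoked; a minimal diagnostic sketch (an assumption, not a confirmed fix; the project was later renamed from airflow to apache-airflow on PyPI):

```shell
# Confirm which `airflow` script is on PATH; a stale script from an
# older install is one possible cause of unrecognized-argument errors.
which airflow
# See which package (old `airflow` vs. newer `apache-airflow`) is
# installed, and at what version.
pip show airflow apache-airflow
# List the subcommands this version's CLI actually accepts.
airflow --help
```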

how to remove airflow install

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-11 05:20:02
Question: I tried pip uninstall airflow and pip3 uninstall airflow, and both return "Cannot uninstall requirement airflow, not installed". I'd like to remove Airflow completely and run a clean install. Answer 1: Airflow is now packaged as apache-airflow. Source: https://stackoverflow.com/questions/57694246/how-to-remove-airflow-install
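Per the answer, the package was renamed on PyPI, so the uninstall must target the new name; a minimal sketch:

```shell
# The project is now published as apache-airflow, so uninstall that name.
pip uninstall apache-airflow
# Repeat for the other interpreter if it was installed there too.
pip3 uninstall apache-airflow
```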

BashOperator raising ImportError for a lib used in other PythonOperators

Submitted by 亡梦爱人 on 2019-12-11 05:20:00
Question: I have a set of tasks in my DAG-builder module that use the PythonOperator, as is standard in Airflow. I deploy Airflow with Docker on Kubernetes. One task is failing with the error message "no module named pandas", while the other tasks using pandas succeed. Yes, I did enter the worker containers and confirmed that pip3 freeze does list pandas. [2018-12-13 12:30:23,332] {bash_operator.py:87} INFO - Temporary script location: /tmp/airflowtmppkovwfth/pscript_pclean (truncated by the source)
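The log excerpt shows the failing task running through bash_operator.py, so a plausible culprit is that the shell the BashOperator spawns resolves a different Python interpreter than the one pip3 installed pandas into. A hedged diagnostic task (the task id and the surrounding dag object are illustrative):

```python
from airflow.operators.bash_operator import BashOperator

# Illustrative check: print which interpreter the BashOperator's shell
# resolves, and whether pandas is importable from that interpreter.
check_env = BashOperator(
    task_id='check_python_env',  # hypothetical task id
    bash_command=(
        'which python3 && python3 --version && '
        'python3 -c "import pandas; print(pandas.__version__)"'
    ),
    dag=dag,  # assumes a `dag` defined elsewhere in the builder module
)
```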

KeyError: 'ibm_db_sa' when trying to use db2 with Apache Airflow

Submitted by 坚强是说给别人听的谎言 on 2019-12-11 05:19:00
Question: I've set up a database connection using sql_alchemy_conn = ibm_db_sa://{USERNAME}:{PASSWORD}@{HOST}:50000/airflow in the airflow.cfg file. When I run airflow initdb, it raises KeyError: 'ibm_db_sa'. How can I use a DB2 connection with Airflow? Here is the more specific error output:

airflow initdb
[2017-02-01 15:55:57,135] {__init__.py:36} INFO - Using executor SequentialExecutor
DB: ibm_db_sa://db2inst1:***@localhost:50000/airflow
[2017-02-01 15:55:58,151] {db.py:222} INFO - (truncated by the source)
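No answer survives in the source for this entry. The KeyError suggests SQLAlchemy has no dialect registered under the name ibm_db_sa, so a reasonable first step (an assumption, not a confirmed fix; DB2 is not among Airflow's officially documented metadata-database backends) is to install the DB2 driver and its SQLAlchemy adapter into the same environment:

```shell
# ibm_db is the DB2 client driver; ibm_db_sa is the SQLAlchemy dialect
# that makes ibm_db_sa:// URLs resolvable.
pip install ibm_db ibm_db_sa
```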

Python - AttributeError: 'NoneType' object has no attribute 'execute'

Submitted by 安稳与你 on 2019-12-11 05:09:53
Question: I am trying to run a Python script that logs into an Amazon Redshift DB and then executes a SQL command. I use a tool called Airflow for workflow management. When running the code below, I can log in to the DB fine, but executing the SQL command fails with AttributeError: 'NoneType' object has no attribute 'execute'. Code: ## Login to DB def db_log(**kwargs): global db_con try: db_con = psycopg2.connect( " dbname = 'name' user = 'user' password = 'pass' host = 'host (truncated by the source)
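The code excerpt is cut off by the source, but this error typically means .execute() was called on a cursor (or connection) that was never assigned, for example because psycopg2.connect failed inside a try block whose handler swallowed the exception and left the global as None. A minimal sketch of a safer pattern, with placeholder credentials:

```python
import psycopg2

def db_log(**kwargs):
    # Placeholder connection values; substitute real Redshift settings.
    db_con = psycopg2.connect(
        dbname='name', user='user', password='pass',
        host='host', port=5439,
    )
    try:
        # The cursor is created only after connect() has succeeded, so
        # it can never be None when execute() runs.
        with db_con.cursor() as cur:
            cur.execute('SELECT 1')  # illustrative SQL command
            print(cur.fetchone())
        db_con.commit()
    finally:
        db_con.close()
```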

Airflow's Gunicorn is spamming error logs

Submitted by 邮差的信 on 2019-12-11 05:03:32
Question: I'm using Apache Airflow and noticed that gunicorn-error.log has grown past 50 GB within 5 months. Most of the messages are INFO-level entries like:

[2018-05-14 17:31:39 +0000] [29595] [INFO] Handling signal: ttou
[2018-05-14 17:32:37 +0000] [2359] [INFO] Worker exiting (pid: 2359)
[2018-05-14 17:33:07 +0000] [29595] [INFO] Handling signal: ttin
[2018-05-14 17:33:07 +0000] [5758] [INFO] Booting worker with pid: 5758
[2018-05-14 17:33:10 +0000] [29595] [INFO] Handling signal: (truncated by the source)
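No answer survives in the source for this entry. The ttin/ttou lines come from the webserver's periodic gunicorn worker refresh, so one hedged mitigation is to slow or disable that refresh in airflow.cfg (option names as in the Airflow 1.x webserver section; verify against your version) and rotate the log file externally:

```ini
# airflow.cfg -- sketch; the values are illustrative
[webserver]
# Refresh workers less frequently (seconds between refresh batches)...
worker_refresh_interval = 3600
# ...or set the batch size to 0 to stop cycling workers altogether.
worker_refresh_batch_size = 0
```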

Airflow add a UI button for each DAG

Submitted by 女生的网名这么多〃 on 2019-12-11 04:46:31
Question: By default, each DAG has a set of buttons (Trigger Dag, Delete Dag, etc.) in the main (Admin) view of the UI. I have been trying to add a button like those that sends an HTTP request every time it is clicked. I successfully used this plugin: https://github.com/airflow-plugins/Getting-Started/blob/master/Tutorial/creating-ui-modification.md, but it does not do what I need, and digging around the Airflow code and the plugin docs didn't help much either. Is it possible to (truncated by the source)
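No answer survives in the source for this entry. One hedged building block is an Airflow 1.x plugin exposing a Flask endpoint that fires the HTTP request; a per-DAG button would then link to it with the dag_id as a query parameter. Every name and URL below is illustrative, and wiring the button into the DAG list itself still requires the kind of template override the linked tutorial covers:

```python
import requests
from flask import Blueprint, request
from airflow.plugins_manager import AirflowPlugin

# Illustrative blueprint; the route receives ?dag_id=<id> from a button.
bp = Blueprint('dag_button_plugin', __name__, url_prefix='/dag_button')

@bp.route('/notify')
def notify():
    dag_id = request.args.get('dag_id', '')
    # Hypothetical downstream service; replace with the real target URL.
    resp = requests.post('http://example.com/hook', json={'dag_id': dag_id})
    return 'Notified for %s (HTTP %s)' % (dag_id, resp.status_code)

class DagButtonPlugin(AirflowPlugin):
    name = 'dag_button_plugin'
    flask_blueprints = [bp]
```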

Airflow DAG not getting scheduled

Submitted by 徘徊边缘 on 2019-12-11 04:31:49
Question: I am new to Airflow and created my first DAG; here is the code. I want the DAG to start now and thereafter run once a day. from airflow import DAG from airflow.operators.bash_operator import BashOperator from datetime import datetime, timedelta default_args = { 'owner': 'airflow', 'depends_on_past': False, 'start_date': datetime.now(), 'email': ['aaaa@gmail.com'], 'email_on_failure': False, 'email_on_retry': False, 'retries': 1, 'retry_delay': timedelta(minutes=5), } dag = DAG( 'alamode (truncated by the source)
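The excerpt already shows the classic pitfall: start_date is set to datetime.now(), which moves forward every time the file is parsed, so no schedule interval can ever complete and the scheduler never triggers a run. A hedged corrected sketch (the DAG id is hypothetical, since the original is truncated, and the task is illustrative):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    # A fixed start_date in the past; a moving datetime.now() keeps the
    # scheduler from ever seeing a completed interval.
    'start_date': datetime(2019, 12, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'alamode_example',  # hypothetical id; the original is cut off
    default_args=default_args,
    schedule_interval='@daily',  # run once per day
)

hello = BashOperator(task_id='hello', bash_command='echo hello', dag=dag)
```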

Yet another “This DAG isn't available in the webserver DagBag object”

Submitted by 强颜欢笑 on 2019-12-11 04:06:49
Question: This seems to be a fairly common issue. I have a DAG that I can not only trigger manually with airflow trigger_dag, but that even executes according to its schedule, yet it refuses to show up in the UI. I've already restarted the webserver and the scheduler multiple times, pressed "refresh" countless times, and run it through airflow backfill. Does anyone have any other ideas? Is there any other pertinent information I can provide? I'm on Airflow 1.9.0. Answer 1: I have been debugging this exact (truncated by the source)
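The answer is cut off by the source, but a common diagnostic for this symptom is to load the DagBag the same way the webserver does and check for import errors, since a file that parses in the scheduler's environment can still fail in the webserver's. A hedged sketch (Airflow 1.9 module paths):

```python
# Run this with the same interpreter and AIRFLOW_HOME as the webserver.
from airflow.models import DagBag

bag = DagBag()             # parses files under the configured DAGS_FOLDER
print(sorted(bag.dags))    # every dag_id the webserver could display
print(bag.import_errors)   # files that failed to parse, with tracebacks
```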

Airflow and Spark/Hadoop - a single cluster, or one for Airflow and another for Spark/Hadoop?

Submitted by 穿精又带淫゛_ on 2019-12-11 03:47:34
Question: I'm trying to figure out the best way to run Airflow with Spark/Hadoop. I already have a Spark/Hadoop cluster, and I'm thinking about creating a separate cluster for Airflow that would submit jobs remotely to the Spark/Hadoop cluster. Any advice? Deploying Spark remotely from another cluster looks a little complicated, and it would duplicate some configuration files. Answer 1: You really only need to configure a yarn-site.xml file, I believe, in order for spark (truncated by the source)
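Expanding on the answer's yarn-site.xml suggestion, a hedged sketch of the minimal client-side file on the Airflow machine: it only has to tell the Hadoop/Spark client where the remote ResourceManager lives, and HADOOP_CONF_DIR should point at the directory holding it when spark-submit --master yarn runs. The hostname is a placeholder:

```xml
<!-- Minimal client-side yarn-site.xml sketch; hostname is illustrative. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>remote-rm.example.com</value>
  </property>
</configuration>
```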