Airbnb Airflow using all system resources

温柔的废话 2021-02-03 19:48

We've set up Airbnb/Apache Airflow for our ETL using LocalExecutor, and as we've started building more complex DAGs, we've noticed that Airflow has started usin

7 answers
  •  旧巷少年郎
    2021-02-03 20:09

    For starters, you can use htop to monitor and debug your CPU usage.

    I would suggest running the webserver and scheduler processes in the same Docker container, which avoids the overhead of running two containers on an EC2 t2.medium. Airflow workers need resources for downloading data and reading it into memory, but the webserver and scheduler are fairly lightweight processes. When you start the webserver, make sure you control the number of workers running on the instance with the -w option of the CLI:

    airflow webserver [-h] [-p PORT] [-w WORKERS]
                             [-k {sync,eventlet,gevent,tornado}]
                             [-t WORKER_TIMEOUT] [-hn HOSTNAME] [--pid [PID]] [-D]
                             [--stdout STDOUT] [--stderr STDERR]
                             [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE] [-l LOG_FILE]
                             [-d]
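
As a rough illustration of the worker cap described above, here is a minimal Python sketch that builds the `airflow webserver` command line with a bounded `-w` value. The helper name `webserver_command`, the default port 8080, and the cap of 2 workers are assumptions made for this example, not something stated in the answer; only the `-p` and `-w` flags come from the usage text above. The sketch only builds and prints the command, it does not start Airflow.

```python
import os
import shlex


def webserver_command(port=8080, workers=None, max_workers=2):
    """Build an `airflow webserver` argv list with a capped worker count.

    `webserver_command` is a hypothetical helper for this example.
    The port and the cap of 2 workers are illustrative assumptions.
    """
    cpu_count = os.cpu_count() or 1
    if workers is None:
        # Never ask for more workers than the cap or the visible CPUs.
        workers = min(max_workers, cpu_count)
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # -p and -w are the PORT and WORKERS options from the usage text.
    return ["airflow", "webserver", "-p", str(port), "-w", str(workers)]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(webserver_command(workers=2)))
```

The list returned by the helper can be passed to `subprocess.run` once the values look right; keeping the worker count explicit makes it easy to compare against what htop reports for the instance.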
    
