Running `airflow scheduler` launches 33 scheduler processes

Submitted by ﹥>﹥吖頭↗ on 2020-12-25 02:32:21

Question


When using LocalExecutor with a MySQL backend, running airflow scheduler on my CentOS 6 box creates 33 scheduler processes, e.g.

deploy 55362 13.5 1.8 574224 73272 ? Sl 18:59 7:42 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler
deploy 55372 0.0 1.5 567928 60552 ? Sl 18:59 0:00 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler
deploy 55373 0.0 1.5 567928 60540 ? Sl 18:59 0:00 /usr/local/bin/python2.7 /usr/local/bin/airflow scheduler
...

These are distinct from Executor processes and gunicorn master and worker processes. Running it with the SequentialExecutor (sqlite backend) just kicks off one scheduler process.
Airflow still works (DAGs are getting run), but the sheer number of these processes makes me think something is wrong.
When I run select * from job where state = 'running'; in the database, only 5 SchedulerJob rows get returned. Is this normal?


Answer 1:


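Yes, this is normal. These are scheduler processes: with LocalExecutor the worker processes are forked from the scheduler, so they show up in ps under the same airflow scheduler command line. You can control their number with the following parameter in airflow.cfg: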

# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 32

These are spawned from the scheduler, whose PID can be found in the airflow-scheduler.pid file.

So 32 workers + 1 scheduler = 33 processes, which is what you are seeing.
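
To check that arithmetic against your own configuration, here is a minimal sketch that reads parallelism from an airflow.cfg-style text and adds one for the parent scheduler. The function name expected_scheduler_processes and the SAMPLE_CFG string are made up for illustration, and it assumes parallelism lives in the [core] section, as in a stock airflow.cfg.

```python
import configparser

# Hypothetical, trimmed-down airflow.cfg contents used only for this example.
SAMPLE_CFG = """
[core]
executor = LocalExecutor
parallelism = 32
"""


def expected_scheduler_processes(cfg_text):
    """Return parallelism + 1: the forked workers plus the parent scheduler."""
    # Interpolation is disabled so that literal '%' characters elsewhere
    # in a real airflow.cfg do not trip up the parser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parallelism = parser.getint("core", "parallelism")
    return parallelism + 1


print(expected_scheduler_processes(SAMPLE_CFG))  # 33
```

For a real installation you would pass in the contents of your own airflow.cfg instead of SAMPLE_CFG.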

Hope this clears up your doubt.

Cheers!




Answer 2:


As of v1.10.3, this is what I found. My settings are:

parallelism = 32
max_threads = 4

There are a total of

  • 1 (main) +
  • 32 (executors) +
  • 1 (dag_processor_manager) +
  • 4 (dag processors)

= 38 processes!
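
The breakdown above can be written as a small calculation. This is only a sketch of the counts observed in this answer on v1.10.3, not something taken from Airflow's code; the function name expected_process_count is hypothetical.

```python
def expected_process_count(parallelism, max_threads):
    """Sum the per-role process counts reported above for v1.10.3."""
    breakdown = {
        "main": 1,                      # the scheduler itself
        "executors": parallelism,       # one per parallelism slot
        "dag_processor_manager": 1,
        "dag processors": max_threads,  # one per max_threads
    }
    return sum(breakdown.values()), breakdown


total, breakdown = expected_process_count(parallelism=32, max_threads=4)
print(total)  # 38
```

Changing either setting and rerunning the function shows how the process count you should expect in ps moves with it.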



Source: https://stackoverflow.com/questions/42729161/running-airflow-scheduler-launches-33-scheduler-processes
