How is Python scaling with Gunicorn and Kubernetes?

折月煮酒 submitted on 2019-12-22 09:40:08

Question


I am going to deploy a Python Flask Server with Docker on Kubernetes using Gunicorn and Gevent/Eventlet as asynchronous workers. The application will:

  1. Subscribe to around 20 different topics on Apache Kafka.
  2. Score some machine learning models with that data.
  3. Upload the results to a relational database.

Each Kafka topic will receive 1 message per minute, so the application needs to consume around 20 messages per minute in total. Handling and execution take around 45 seconds per message. The question is: how can I scale this well? I know I can add multiple workers in Gunicorn and use multiple replicas of the pod when I deploy to Kubernetes. But is that enough? Will the workload be automatically balanced between the available workers in the different pods? What else can I do to ensure scalability?
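A quick capacity estimate follows from Little's law (average work in flight = arrival rate × service time). Under the stated numbers, roughly 15 messages are being handled at any moment, so the total number of concurrent workers across all pods should comfortably exceed that:

```python
# Back-of-the-envelope sizing with Little's law: L = lambda * W.
# Assumption: arrivals are steady and each message is handled independently.
arrival_rate = 20 / 60        # messages per second (20 topics x 1 msg/min)
handling_time = 45            # seconds of work per message
in_flight = arrival_rate * handling_time  # average messages being processed
print(in_flight)              # -> 15.0
```

This is only a lower bound on concurrency; bursts, retries, and slow database writes all push the safe worker count higher.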


Answer 1:


I recommend you set up a Horizontal Pod Autoscaler (HPA) for your workers.

This requires setting up support for the metrics API. Note that on later versions of Kubernetes, Heapster has been deprecated in favor of the metrics server; custom metrics are served through the custom metrics API.
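An HPA targeting CPU utilization might look like the following. This is a minimal sketch: the Deployment name `flask-scorer` and the replica bounds are assumptions, and the exact `apiVersion` depends on your cluster version (older clusters use `autoscaling/v2beta2`):

```yaml
# Hypothetical HPA for a Deployment named "flask-scorer" (name assumed).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flask-scorer-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-scorer
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```

For a workload that is gated on Kafka throughput rather than HTTP traffic, a custom metric such as consumer lag is often a better scaling signal than CPU.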

If you are using a public cloud like AWS, GCP, or Azure, I'd also recommend setting up an autoscaling group so that you can scale your VMs (nodes) based on metrics like average CPU utilization.

Hope it helps!



Source: https://stackoverflow.com/questions/52458393/how-is-python-scaling-with-gunicorn-and-kubernetes
