Cluster autoscaler not downscaling


Answering myself for visibility.

The problem is that the CA never considers removing anything unless all of the requirements mentioned in the FAQ are met at the same time. So let's say I have 100 nodes, each with 51% of its CPU requested: the CA still won't consider downscaling any of them.

One solution would be to raise the utilization threshold the CA checks against, which is currently 50%. Unfortunately that is not supported by GKE; see this answer from Google support (@GalloCedrone):

Moreover, I know that this value might sound too low, and someone might be interested in keeping it at 85% or 90% instead to avoid your scenario. Currently there is a feature request open to give users the possibility to modify the flag "--scale-down-utilization-threshold", but it is not implemented yet.
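For completeness: on a self-managed cluster (outside GKE, where the autoscaler is managed by Google and cannot be edited) this threshold is just a command-line flag on the cluster-autoscaler Deployment. Below is a minimal sketch of what raising it could look like; the image tag and node-group name are placeholders.

    # Sketch only: applies to a self-managed cluster-autoscaler, not to GKE's managed one.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cluster-autoscaler
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cluster-autoscaler
      template:
        metadata:
          labels:
            app: cluster-autoscaler
        spec:
          serviceAccountName: cluster-autoscaler       # placeholder service account
          containers:
          - name: cluster-autoscaler
            image: k8s.gcr.io/cluster-autoscaler:v1.14.7   # placeholder version
            command:
            - ./cluster-autoscaler
            - --cloud-provider=gce
            - --nodes=1:10:my-instance-group               # placeholder node group
            - --scale-down-utilization-threshold=0.85      # nodes below 85% requested become scale-down candidates (default 0.5)
            - --scale-down-unneeded-time=10m               # how long a node must stay unneeded before removal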

The workaround I found is to decrease the CPU request of the pods (100m instead of 300m) and have the Horizontal Pod Autoscaler (HPA) create more replicas on demand. This is fine for me, but if your application is not suitable for many small replicas, you are out of luck. Perhaps a cron job that cordons a node if total utilization is low? A sketch of the workaround is below.
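Concretely, it looks roughly like this; the names, image, and target utilization are all made up for the example, and "kubectl cordon <node>" would be the command to script if you go the cron-job route.

    # Sketch of the workaround: small CPU requests per pod, HPA adds replicas on demand.
    # All names and values here are illustrative.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: my-app:latest          # placeholder image
            resources:
              requests:
                cpu: 100m                 # was 300m; smaller pods are easier to repack onto fewer nodes
    ---
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: my-app
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      minReplicas: 2
      maxReplicas: 20
      targetCPUUtilizationPercentage: 70  # HPA adds replicas when average CPU use exceeds 70% of the request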

I agree that, according to the documentation, it seems that "gke-name-cluster-default-pool" could be safely deleted, since the conditions are:

  • The sum of cpu and memory requests of all pods running on this node is smaller than 50% of the node's allocatable.
  • All pods running on the node (except these that run on all nodes by default, like manifest-run pods or pods created by daemonsets) can be moved to other nodes.
  • It doesn't have the scale-down disabled annotation (see the example after this list).

Therefore the CA should remove the node once it has been considered unneeded for 10 minutes.
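For reference, the annotation mentioned in the last condition lives on the Node object; if it is present, the CA skips that node during scale-down. A sketch (the node name is a placeholder):

    # If this annotation is set on a node, the CA will not scale it down.
    apiVersion: v1
    kind: Node
    metadata:
      name: gke-name-cluster-default-pool-xxxx   # placeholder node name
      annotations:
        cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"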

However, checking the documentation I found:

What types of pods can prevent CA from removing a node?

[...] Kube-system pods that are not run on the node by default [...]

heapster-v1.5.2--- is running on the node, and it is a kube-system pod that is not run on the node by default.
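According to the same FAQ, kube-system pods like this one only block scale-down when they have no PodDisruptionBudget (or a too-restrictive one), so one option is to give heapster a PDB. A sketch, assuming the pod carries the usual k8s-app: heapster label (verify the real labels before applying):

    # Sketch: a PDB that allows the CA to evict heapster and move it to another node.
    # The label selector is an assumption; check the pod's actual labels first.
    apiVersion: policy/v1beta1
    kind: PodDisruptionBudget
    metadata:
      name: heapster-pdb
      namespace: kube-system
    spec:
      maxUnavailable: 1
      selector:
        matchLabels:
          k8s-app: heapster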

I will update the answer if I discover more interesting information.

UPDATE

The fact that the node is the last one in its zone is not an issue.

To prove it, I tested on a brand-new cluster with 3 nodes, each in a different zone; one of them had no workload apart from "kube-proxy" and "fluentd", and it was correctly deleted even though doing so brought the size of its zone to zero.
