Question
I am pretty desperate searching for a solution to this. I am running a Kubernetes Cluster (v1.16.7) on AWS.
Node specs: an Amazon EC2 t3.medium instance with 4GB RAM, AMI k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17, kernel 4.9.0-7-amd64.
My main problem is that I see increased memory usage in the kernel, which leads to faster memory starvation issues in my node. More specifically:
free -m:

              total        used        free      shared  buff/cache   available
Mem:           3895        3470         130           3         294         204
Swap:             0           0           0
This shows that my actual used memory (excluding cache and reclaimable memory) is currently around 3.4GB.
Also, the output of sudo smem -twk:

Area                           Used      Cache   Noncache
firmware/hardware                 0          0          0
kernel image                      0          0          0
kernel dynamic memory          1.5G     184.1M       1.3G
userspace memory               2.2G     111.1M       2.1G
free memory                  125.5M     125.5M          0
----------------------------------------------------------
                               3.8G     420.7M       3.4G
This matches the output of free in the following way (a quick cross-check follows the list):

- used column in free = smem kernel Noncache + userspace Noncache = 3.4GB
- buff/cache column in free = smem kernel Cache + userspace Cache = 294MB
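The cross-check itself, assuming the same free and smem versions as above (smem's -t flag prints the totals row used here):

# free(1)'s "used" is total - free - buff/cache, so it should equal
# smem's kernel Noncache + userspace Noncache:
free -m | awk '/^Mem:/ {print $2 - $4 - $6 " MB used = total - free - buff/cache"}'
sudo smem -twk | tail -n1   # last line holds the Used / Cache / Noncache totals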
kubectl top node also matches the userspace memory in smem, showing around 2.2GB, and so do the totals from top and ps aux for the running processes.
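For completeness, this is the kind of cross-check meant above (the node name is a placeholder; kubectl top needs metrics-server installed):

# Sum the resident set size of all processes and compare with smem's
# "userspace memory" row (~2.2G here). RSS double-counts shared pages;
# smem's PSS column is the more precise figure.
ps aux | awk 'NR > 1 {rss += $6} END {printf "total RSS: %.1f MiB\n", rss/1024}'
sudo smem -tk | tail -n1
kubectl top node <node-name>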
However, my /proc/meminfo shows:
MemTotal: 3989436 kB
MemFree: 133272 kB
MemAvailable: 209416 kB
Buffers: 10472 kB
Cached: 255628 kB
SwapCached: 0 kB
Active: 2340712 kB
Inactive: 80612 kB
Active(anon): 2156712 kB
Inactive(anon): 1752 kB
Active(file): 184000 kB
Inactive(file): 78860 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 1404 kB
Writeback: 0 kB
AnonPages: 2155264 kB
Mapped: 111500 kB
Shmem: 3220 kB
Slab: 121856 kB
SReclaimable: 36260 kB
SUnreclaim: 85596 kB
KernelStack: 17440 kB
PageTables: 32972 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1994716 kB
Committed_AS: 8704948 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 518120 kB
DirectMap2M: 3614720 kB
DirectMap1G: 0 kB
shows total kernel memory usage of Slab + SReclaimable + SUnreclaim ≈ 238MB (and Slab already includes the other two fields, so the distinct total is even smaller), which is nowhere near the 1.3GB shown by smem and reflected in the used column of free.
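One way to size the gap is to total the kernel-side fields that /proc/meminfo does itemize and compare that with what smem attributes to the kernel (a rough sketch; field availability varies by kernel version):

# Kernel memory that /proc/meminfo accounts for explicitly.
# Note: Slab already includes SReclaimable and SUnreclaim.
awk '/^(Slab|KernelStack|PageTables|Bounce):/ {sum += $2}
     END {printf "itemized kernel memory: %.0f MiB\n", sum/1024}' /proc/meminfo
# Anything smem's "kernel dynamic memory" shows beyond this sum (for example
# socket buffers or other page allocations made outside the slab allocator)
# has no dedicated meminfo counter on a 4.9 kernel.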
So where is the extra kernel memory being spent?
Are there any other ways to check where kernel memory is used?
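For reference, the obvious generic places to look beyond /proc/meminfo (run as root; output formats vary by kernel version):

# Slab usage per cache, sorted by cache size:
sudo slabtop -o -s c | head -20
# Raw per-cache numbers:
sudo head /proc/slabinfo
# vmalloc allocations (VmallocUsed is reported as 0 on recent kernels,
# so sum the per-allocation sizes instead):
sudo awk '{sum += $2} END {printf "vmalloc: %.0f MiB\n", sum/1024/1024}' /proc/vmallocinfo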
Thanks!
UPDATE
After much trial and error with the configuration, the problem has been narrowed down to the FluentD logging system.
We have an in-app logging mechanism that targets a FluentD service via a TCP @type forward source, which then ships the records to ElasticSearch through an @type elasticsearch match. The same FluentD service also captures local logfiles and sends them to Elastic without any problem, so it seems to have something to do with the TCP communication...
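For reference, a minimal sketch of what such a forward-to-Elasticsearch pipeline typically looks like in FluentD terms; the port, host and match pattern here are illustrative assumptions, not the actual values from our chart:

<source>
  @type forward          # in-app loggers send records over TCP to this port
  port 24224
  bind 0.0.0.0
</source>

<match **>
  @type elasticsearch    # fluent-plugin-elasticsearch output
  host elasticsearch-client
  port 9200
  logstash_format true
</match>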
The image used is quay.io/fluentd_elasticsearch/fluentd:v3.1.0 from the v11.3.0 helm chart at https://github.com/kokuwaio/helm-charts/tree/main/charts/fluentd-elasticsearch
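Given the TCP suspicion, kernel socket-buffer memory is worth watching while the forwarder is under load; it is counted as kernel memory but never shows up in Slab (the TCP/UDP "mem" values below are in 4 KiB pages):

# Global socket memory accounting:
cat /proc/net/sockstat
# Per-socket send/receive buffer usage for established TCP connections:
ss -tm state established | head -40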
Source: https://stackoverflow.com/questions/65024698/high-kernel-memory-usage-in-kubernetes-node