How does K8s enable cgroup2 support?

This article was last updated on February 7, 2024.

What is cgroup

Control groups, often referred to as cgroups, are a feature of the Linux kernel. They allow processes to be organized into hierarchical groups whose use of various resources can then be limited and monitored. The kernel's cgroup interface is provided through a pseudo-file system called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and throttling are implemented in a set of per-resource subsystems (memory, CPU, and so on).

cgroups are one of the underlying technologies behind containers and cloud native computing. Both the kubelet and the container runtime (CRI) need to interface with cgroups to enforce resource management for pods and containers, that is, the CPU and memory requests and limits.

There are two versions of cgroup in Linux: cgroup v1 and cgroup v2. cgroup v2 is the newer generation of the cgroup API.

cgroup v2 support in Kubernetes is officially stable (it graduated to GA in v1.25).

What are the advantages of cgroup v2

cgroup v2 provides a unified control system with enhanced resource management capabilities.

cgroup v2 makes several improvements to cgroup v1, such as:

- A single, unified hierarchy design in the API
- Safer subtree delegation to containers
- Newer features, such as Pressure Stall Information (PSI)
- Enhanced resource allocation management and isolation across multiple resources
  - Unified accounting of different types of memory allocations (network memory, kernel memory, and so on)
  - Accounting for non-immediate resource changes, such as page cache writeback

Some Kubernetes features rely specifically on cgroup v2 for enhanced resource management and isolation. For example, the MemoryQoS feature improves memory QoS and relies on cgroup v2 primitives.

Prerequisites for using cgroup v2

cgroup v2 has the following requirements:

- The OS distribution enables cgroup v2, for example:
  - Ubuntu (starting with 21.10; 22.04+ recommended)
  - Debian GNU/Linux (starting with Debian 11 Bullseye)
  - Fedora (starting with 31)
  - RHEL and RHEL-like distributions (starting with 9)
  - …
- The Linux kernel version is 5.8 or later
- The container runtime supports cgroup v2, for example:
  - containerd v1.4 and later
  - cri-o v1.20 and later
- The kubelet and the container runtime are configured to use the systemd cgroup driver
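
The kernel requirement above can be checked on a node with a short script. The following is a minimal sketch, not an official tool: the helper name `kernel_at_least` is hypothetical, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeeds if kernel release $1 is at least version $2.
# The body runs in a subshell so its variables do not leak into the caller.
kernel_at_least() (
  release="${1%%-*}"   # "5.10.0-28-amd64" becomes "5.10.0"
  minimum="$2"
  # Version-sort both strings; the check passes when the minimum sorts first.
  lowest="$(printf '%s\n%s\n' "$minimum" "$release" | sort -V | head -n 1)"
  [ "$lowest" = "$minimum" ]
)

if kernel_at_least "$(uname -r)" "5.8"; then
  echo "kernel $(uname -r): meets the 5.8 minimum"
else
  echo "kernel $(uname -r): older than 5.8"
fi
```

The comparison only looks at the part of the release string before the first hyphen, so unusual vendor suffixes may need extra handling.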

Use cgroup v2

📝Notes:

The steps below use Debian 11 Bullseye with containerd v1.4 as an example.

Enable and check cgroup v2 for Linux nodes

Debian 11 Bullseye has cgroup v2 enabled by default.

This can be verified by the following command:

`stat -fc %T /sys/fs/cgroup/`

For cgroup v2, the output is `cgroup2fs`.
For cgroup v1, the output is `tmpfs`.
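
To use this check in a provisioning script, the two outputs can be mapped to a version label. This is a small sketch under the assumption that only the two filesystem types listed above matter; the function name `cgroup_version_from_fstype` is made up for illustration.

```shell
# Hypothetical helper: map the filesystem type printed by stat to a cgroup version.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# "|| true" keeps the script going on machines without /sys/fs/cgroup.
fstype="$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || true)"
echo "cgroup version: $(cgroup_version_from_fstype "$fstype")"
```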

If cgroup v2 is not enabled, add `systemd.unified_cgroup_hierarchy=1` to `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, then run `sudo update-grub` and reboot.

📝Notes:
On a Raspberry Pi, the standard Raspberry Pi OS does not enable cgroups at install time, and cgroups are needed to start the systemd service. You can enable them by appending `cgroup_memory=1 cgroup_enable=memory systemd.unified_cgroup_hierarchy=1` to `/boot/cmdline.txt`, then rebooting for the change to take effect.
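
Both the GRUB and the Raspberry Pi changes come down to appending a parameter to a kernel command line without duplicating it. The sketch below shows that string handling only; it does not edit any file. The helper `add_kernel_param` and the sample value `quiet` are assumptions for illustration.

```shell
# Hypothetical helper: print command line $1 with parameter $2 appended,
# unless the parameter is already present as a whole word.
add_kernel_param() (
  cmdline="$1"
  param="$2"
  case " $cmdline " in
    *" $param "*) printf '%s\n' "$cmdline" ;;
    *)            printf '%s\n' "${cmdline:+$cmdline }$param" ;;
  esac
)

current='quiet'   # sample value of GRUB_CMDLINE_LINUX, chosen for illustration
add_kernel_param "$current" "systemd.unified_cgroup_hierarchy=1"
# prints: quiet systemd.unified_cgroup_hierarchy=1
```

After rebooting, `cat /proc/cmdline` shows the parameters the running kernel was actually booted with.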

The kubelet uses the systemd cgroup driver

kubeadm allows you to pass a `KubeletConfiguration` structure when running `kubeadm init`. The `KubeletConfiguration` can include the `cgroupDriver` field, which controls the cgroup driver of the kubelet.

Note: Starting with v1.22, if the user does not set the `cgroupDriver` field under `KubeletConfiguration`, `kubeadm init` defaults it to `systemd`.

Here is a minimal example that configures this field explicitly:

# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0

---

kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd

Such a configuration file can be passed to the kubeadm command:

`kubeadm init --config kubeadm-config.yaml`

Note:

kubeadm uses the same `KubeletConfiguration` for all nodes in the cluster. The `KubeletConfiguration` is stored in a ConfigMap object under the `kube-system` namespace.

Running subcommands such as `init`, `join`, and `upgrade` causes kubeadm to write the `KubeletConfiguration` to `/var/lib/kubelet/config.yaml` and pass it to the kubelet on the local node.
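
Because the effective configuration ends up in `/var/lib/kubelet/config.yaml`, the driver a node is using can be read back from that file. A minimal sketch follows, assuming the field appears as an unquoted top-level `cgroupDriver:` key; the helper name `kubelet_cgroup_driver` is hypothetical.

```shell
# Hypothetical helper: print the value of the top-level cgroupDriver key in file $1.
# Assumes the value is unquoted, for example "cgroupDriver: systemd".
kubelet_cgroup_driver() {
  sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1"
}

config=/var/lib/kubelet/config.yaml
if [ -r "$config" ]; then
  echo "kubelet cgroup driver: $(kubelet_cgroup_driver "$config")"
else
  echo "$config is not readable on this machine"
fi
```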

Containerd uses the systemd cgroup driver

Edit `/etc/containerd/config.toml`:

[plugins.cri.containerd.runtimes.runc.options]
  SystemdCgroup = true
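
A quick way to confirm the setting is present is to search the file for it. This is only a sketch: the helper `containerd_uses_systemd_cgroup` is hypothetical, and it matches the `SystemdCgroup = true` line anywhere in the file rather than checking which table it belongs to.

```shell
# Hypothetical helper: succeeds if file $1 contains a "SystemdCgroup = true" line.
containerd_uses_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}

config=/etc/containerd/config.toml
if [ ! -r "$config" ]; then
  echo "$config is not readable on this machine"
elif containerd_uses_systemd_cgroup "$config"; then
  echo "containerd: systemd cgroup driver enabled"
else
  echo "containerd: systemd cgroup driver not enabled"
fi
```

containerd reads its configuration at startup, so the service has to be restarted after the file is edited.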

Upgrade monitoring components to support cgroup v2 monitoring

cgroup v2 uses a different API than cgroup v1, so any applications that access the cgroup file system directly need to be updated to versions that support cgroup v2. For example:

- Some third-party monitoring and security agents may depend on the cgroup file system. Update these agents to versions that support cgroup v2.
- If you run cAdvisor as a stand-alone DaemonSet to monitor pods and containers, update it to v0.43.0 or later.
- If you deploy Java applications, prefer JDK 11.0.16 and later or JDK 15 and later, which fully support cgroup v2.
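
One concrete instance of the API difference is the file that holds a cgroup's memory limit: cgroup v1 exposes it as `memory.limit_in_bytes`, while cgroup v2 uses `memory.max`. The sketch below shows the kind of branching an agent that reads cgroupfs directly has to do; the helper name `memory_limit_file` is hypothetical.

```shell
# Hypothetical helper: print the memory limit file name for a cgroup version.
memory_limit_file() {
  case "$1" in
    v2) echo "memory.max" ;;
    v1) echo "memory.limit_in_bytes" ;;
    *)  return 1 ;;   # unknown version: report failure instead of guessing
  esac
}

for version in v1 v2; do
  echo "cgroup $version memory limit file: $(memory_limit_file "$version")"
done
```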

Done 🎉🎉🎉

Summary

cgroup v2 support in Kubernetes is officially stable. cgroup v2 has the following advantages over cgroup v1:

- A single, unified hierarchy design in the API
- Safer subtree delegation to containers
- Newer features, such as Pressure Stall Information (PSI)
- Enhanced resource allocation management and isolation across multiple resources
  - Unified accounting of different types of memory allocations (network memory, kernel memory, and so on)
  - Accounting for non-immediate resource changes, such as page cache writeback

When running Kubernetes v1.25 or later, use a Linux distribution and a container runtime that support cgroup v2, and enable cgroup v2 on the nodes.