Rancher Article Series - K3S Cluster Upgrade

This article was last updated on: February 7, 2024

Overview

This picks up from the previous article, Rancher Series - Rancher Upgrade, in which we upgraded Rancher from v2.6.3 to v2.6.4 with Helm.

Next, we start upgrading the K3S cluster, from v1.21.7+k3s1 to v1.22.5+k3s2.

The basic information of the K3S cluster to be upgraded:

  1. A 4-machine K3S cluster on Tianyi Cloud: 1 master (which also runs etcd) and 3 agent nodes
  2. The cluster was in fact batch-installed with the k3s-ansible script
  3. K3S v1.21.7+k3s1
  4. Rancher was just upgraded to v2.6.4, and verification showed no major problems
  5. The cluster uses Traefik to manage Ingress
  6. The cluster uses the embedded etcd datastore

Upgrade method assessment

The official documentation provides the following upgrade methods:

  • Basic upgrades
    • Upgrade K3s using the installation script
    • Manually upgrade K3s using the binary
  • Automated upgrades
    • Use Rancher to upgrade your K3s cluster
    • Use system-upgrade-controller to manage K3s cluster upgrades

I went through all of them; let me start with the ones I passed on, and why:

Use Rancher to upgrade your K3s cluster - 🙅‍♂️

Detailed documentation is here: Upgrade the Kubernetes version | Rancher | Rancher documentation

The original text is as follows:

📚️ Quote:

Steps:

  1. From the Global view, find the cluster whose Kubernetes version you want to upgrade. Choose Ellipsis > Edit.
  2. Click Cluster Options.
  3. From the Kubernetes Version drop-down menu, select the version of Kubernetes that you want to use for the cluster.
  4. Click Save.

Result: The cluster starts upgrading its Kubernetes version.

But, but! I never found Ellipsis > Edit anywhere in my Rancher v2.6.4. 😂😂😂

I guess it's because the Chinese documentation I read only covers Rancher v2.5, while the Rancher v2.6 UI has been reworked quite a bit, so I couldn't find it.

Besides, this is Rancher's local cluster, and it has only a single master node; my personal assessment is that the goal of automated upgrades is not achievable here.

Pass.

Use system-upgrade-controller to manage K3s cluster upgrades - 🙅‍♂️

Detailed documentation can be found here: Automatic upgrade | Rancher documentation

I tried it, and after I created the server-plan, it reported that the server-plan pods could not be scheduled because no node met the scheduling conditions.

From a quick look, the plan requires the pod to be scheduled on a master node, but I have only 1 master, and the plan cordons it (cordon: true) before the upgrade, which creates a conflict, so the upgrade could not proceed.
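
For context, the server plan I created followed the documented pattern; the sketch below is a minimal illustration, not necessarily my exact manifest (the plan name, namespace, and label key are the typical values from the docs). The cordon: true field is the one that clashes with a single master:

# Minimal server-plan sketch for system-upgrade-controller (illustrative values)
kubectl apply -f - <<'EOF'
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  # Cordon the node before upgrading it; on a single-master cluster this
  # collides with the requirement to schedule the pod on a master node
  cordon: true
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/master
        operator: In
        values: ["true"]
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.22.5+k3s2
EOF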

It is precisely because of this that I conclude:

  • With a single master node, automated upgrades are not feasible; even if an upgrade could run, the risk would be considerable

Pass.

Upgrade K3s manually using the binary - 🙅‍♂️

This method is workable and the steps are clear; it would fit naturally into the k3s-ansible script as a new upgrade.yml playbook.

But… I don't have time in the near future, so I'll note this down for now and add that capability later when I do.

Upgrade K3s using the installation script - ✔️

Although I didn't install K3s with the installation script, the logic of the k3s-ansible script is basically the same as the official installation script, except that it is driven by Ansible. After evaluating it, I believe it is enough to rerun the installation script with the same flags to upgrade K3s from an older version.

So this is the method I went with. ✔️

Upgrade steps

0. Information Collection

registries.yaml

There is an existing registries.yaml configuration, as follows:

mirrors:
  docker.io:
    endpoint:
      - "https://registry.cn-hangzhou.aliyuncs.com"
      - "https://docker.mirrors.ustc.edu.cn"
configs:
  'docker.io':
    auth:
      username: caseycui
      password: <my-password>
  'quay.io':
    auth:
      username: east4ming
      password: <my-password>

Its location has not changed: it is still /etc/rancher/k3s/registries.yaml, so this adds no extra upgrade steps.
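
After the upgrade it is still worth a quick check that the file is intact and that the mirrors made it into the containerd configuration K3s generates; the paths below are the K3s defaults:

# The registries.yaml file should be untouched by the upgrade
sudo ls -l /etc/rancher/k3s/registries.yaml
# K3s renders registries.yaml into its generated containerd config
sudo grep -A 3 'mirrors' /var/lib/rancher/k3s/agent/etc/containerd/config.toml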

Other K3s server and agent configuration

k3s_version: v1.21.7+k3s1
ansible_user: caseycui
systemd_dir: /etc/systemd/system
master_ip: "{{ hostvars[groups['master'][0]]['ansible_host'] | default(groups['master'][0]) }}"
extra_server_args: '--write-kubeconfig-mode "644" --cluster-init --disable-cloud-controller --tls-san <my-public-ip> --kube-apiserver-arg "feature-gates=EphemeralContainers=true" --kube-scheduler-arg "feature-gates=EphemeralContainers=true" --kube-apiserver-arg=default-watch-cache-size=1000 --kube-apiserver-arg=delete-collection-workers=10 --kube-apiserver-arg=event-ttl=30m --kube-apiserver-arg=max-mutating-requests-inflight=800 --kube-apiserver-arg=max-requests-inflight=1600 --etcd-expose-metrics=true'
extra_agent_args: ''

Analyzing the above configuration: it only adds some extra server installation parameters. When switching to the official installation script, just make sure to rerun the installation script with the same flags.
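
Before rerunning the installer, the exact flags a node currently runs with can be double-checked in the systemd unit generated at install time:

# Show the current k3s server startup flags from the systemd unit
systemctl cat k3s | grep -A 20 'ExecStart'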

1. Backup

Use k3s etcd-snapshot to make a backup, as follows:

# k3s etcd-snapshot
INFO[2022-05-05T17:10:01.884597095+08:00] Managed etcd cluster bootstrap already complete and initialized
W0505 17:10:02.477542 2431147 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0505 17:10:02.923819 2431147 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0505 17:10:03.398185 2431147 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
INFO[2022-05-05T17:10:03.687171696+08:00] Saving etcd snapshot to /var/lib/rancher/k3s/server/db/snapshots/on-demand-4azlmvglqkx7migt-0002-1651741803
{"level":"info","msg":"created temporary db file","path":"/var/lib/rancher/k3s/server/db/snapshots/on-demand-4azlmvglqkx7migt-0002-1651741803.part"}
{"level":"info","ts":"2022-05-05T17:10:03.693+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":"2022-05-05T17:10:14.841+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"327 MB","took":"11.182733612s"}
{"level":"info","msg":"saved","path":"/var/lib/rancher/k3s/server/db/snapshots/on-demand-4azlmvglqkx7migt-0002-1651741803"}
INFO[2022-05-05T17:10:14.879646814+08:00] Saving current etcd snapshot set to k3s-etcd-snapshots ConfigMap

📝Notes:

You can also add parameters to back the data up to S3 (see the sketch below).
I did not do that this time because the cluster's Internet bandwidth is too small and backups to S3 kept getting interrupted, so I gave up.
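
For reference, the S3 variant would look roughly like this; the endpoint, bucket, and credentials are placeholders:

# On-demand etcd snapshot straight to S3 (replace the placeholders)
k3s etcd-snapshot \
  --s3 \
  --s3-endpoint "<s3-endpoint>" \
  --s3-bucket "<my-bucket>" \
  --s3-access-key "<access-key>" \
  --s3-secret-key "<secret-key>"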

The backup results are located at /var/lib/rancher/k3s/server/db/snapshots/, as shown below:

[Image: K3s backup results]
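
Should the upgrade go badly, a snapshot from this directory can be restored by stopping K3s on the server and running a cluster reset; the snapshot filename here is a placeholder:

# Stop K3s, then reset the (single) server from a snapshot
systemctl stop k3s
k3s server \
  --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/<snapshot-name>
# Start K3s again once the reset completes
systemctl start k3s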

2. k3s-killall.sh

To maximize the upgrade's chance of success, and since this K3s cluster is mainly used for testing and demos and can be shut down completely, k3s-killall.sh can be used to stop each node before upgrading it.

Before upgrading the corresponding node, run the following command:

/usr/local/bin/k3s-killall.sh

3. Use the installation script to upgrade the server

🐾Notes:

To upgrade K3s from an older version, you can rerun the installation script with the same flags

Run the following command to upgrade:

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_VERSION=v1.22.5+k3s2  INSTALL_K3S_MIRROR=cn K3S_KUBECONFIG_MODE=644 sh -s - --cluster-init --disable-cloud-controller --tls-san <my-public-ip> --kube-apiserver-arg "feature-gates=EphemeralContainers=true" --kube-scheduler-arg "feature-gates=EphemeralContainers=true"  --kube-apiserver-arg default-watch-cache-size=1000 --kube-apiserver-arg delete-collection-workers=10 --kube-apiserver-arg event-ttl=30m --kube-apiserver-arg max-mutating-requests-inflight=800 --kube-apiserver-arg max-requests-inflight=1600 --etcd-expose-metrics true

The parameters are explained as follows:

  • INSTALL_K3S_VERSION=v1.22.5+k3s2: the target version of the upgrade
  • K3S_KUBECONFIG_MODE=644 ... --etcd-expose-metrics true: all consistent with the flags of the previous installation
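
Once the script finishes, a quick sanity check that the server actually came back on the new version doesn't hurt:

# Confirm the binary, the service, and the node all look healthy
k3s --version
systemctl status k3s --no-pager
kubectl get nodes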

The upgrade is successful, and the log is as follows:

[INFO]  Using v1.22.5+k3s2 as release
[INFO]  Downloading hash https://rancher-mirror.rancher.cn/k3s/v1.22.5-k3s2/sha256sum-amd64.txt
[INFO]  Downloading binary https://rancher-mirror.rancher.cn/k3s/v1.22.5-k3s2/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s

4. Use the installation script to upgrade the agent

🐾Notes:

To upgrade K3s from an older version, you can rerun the installation script with the same flags

Run the following command to upgrade:

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_VERSION=v1.22.5+k3s2  INSTALL_K3S_MIRROR=cn K3S_URL=https://<my-master-ip>:6443  K3S_TOKEN=<my-token> sh -s -

The parameters are explained as follows:

  • The rest is similar to the server upgrade: mainly the target version plus the same flags
  • K3S_URL=https://<my-master-ip>:6443 K3S_TOKEN=<my-token> are the parameters required when installing as an agent
    • K3S_TOKEN lives at /var/lib/rancher/k3s/server/node-token on the server (see the command below); the token did not change before and after the upgrade
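
The token can be read on the server node like this:

# Print the cluster join token used by agents
sudo cat /var/lib/rancher/k3s/server/node-token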

5. Verification

This can be verified with a few kubectl commands, or through a graphical interface such as Lens, K9s, or Rancher.

A cursory look at these places (the kubectl spot checks after this list cover most of them):

  • Events: no Warnings
  • Node status: no exceptions
  • Pod status: no exceptions
  • Job status: no failures
  • Ingress status: no access exceptions
  • PVC status: everything in the Bound state
  • kind: Addon status: no exceptions
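
A rough set of spot checks corresponding to the list above:

# Nodes should be Ready and report v1.22.5+k3s2
kubectl get nodes -o wide
# Pods that are not Running/Completed deserve a closer look
kubectl get pods -A | grep -Ev 'Running|Completed'
# Warning events across all namespaces
kubectl get events -A --field-selector type=Warning
# Jobs, Ingresses, and PVCs at a glance
kubectl get jobs -A
kubectl get ingress -A
kubectl get pvc -A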

🎉🎉🎉
However, several problems were also found during verification; they are described and solved one by one:

📚️ Reference documentation