Rancher Series - Rancher Upgrade

This article was last updated on: February 7, 2024

Overview

Previously, I installed a K3s cluster with 1 master (which also runs etcd) and 3 nodes on 4 machines, and then installed Rancher v2.6.3 on it with Helm.

A few days ago, I noticed that the latest version officially recommended by Rancher is v2.6.4.

So I decided to upgrade Rancher and the K3s cluster, one after the other.

Following the official recommendations, the plan is to:

  1. Upgrade Rancher from v2.6.3 to v2.6.4
  2. Upgrade the K3s cluster from v1.21.7+k3s1 to v1.22.5+k3s2

This article records the Rancher upgrade.

The basic information about the Rancher installation being upgraded:

  1. Rancher v2.6.3
  2. Installed online with Helm 3
  3. Certificates managed with cert-manager (v1.7.1) + Let’s Encrypt

Upgrade steps

Back up the Kubernetes cluster running Rancher Server

Use the backup application to back up Rancher.

If something goes wrong during the upgrade process, you’ll use the backup as a recovery point.

The backup result is as follows:

Backup result in the Rancher UI

Backup object in object storage
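
For readers who prefer the command line, a one-off backup can also be requested by creating a Backup resource for the rancher-backup operator. The following is only a minimal sketch: it assumes the operator is already installed, that its default ResourceSet `rancher-resource-set` is in use, and that a default storage location was configured at install time. The function name and the backup name are hypothetical.

```shell
#!/bin/sh
# Print a one-off Backup manifest for the rancher-backup operator.
# Assumptions: the operator is installed, "rancher-resource-set" is its
# ResourceSet, and a default storage location is already configured.
# The default backup name below is a hypothetical placeholder.
backup_manifest() (
  name="${1:-pre-upgrade-backup}"
  cat <<EOF
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: ${name}
spec:
  resourceSetName: rancher-resource-set
EOF
)
```

A possible use is `backup_manifest rancher-v2-6-3 | kubectl apply -f -`, run against the cluster that hosts Rancher.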

Update the Helm Chart repository

  1. Update the local helm cache.

    helm repo update
  2. Get the name of the repository used to install Rancher.

    For more information about repositories and their differences, see Helm Chart Repositories

    • Latest: Recommended for trying out the latest features
    • Stable: Recommended for production environments (📝 I use this)
    • Alpha: An experimental preview of an upcoming release

    Replace <CHART_REPO> with latest, stable, or alpha (in this article the repository is rancher-stable).

    $ helm repo list
    NAME                    URL
    bitnami                 https://charts.bitnami.com/bitnami
    grafana                 https://grafana.github.io/helm-charts
    aliyuncs                https://apphub.aliyuncs.com
    rancher-stable          http://rancher-mirror.oss-cn-beijing.aliyuncs.com/server-charts/stable
    prometheus-community    https://prometheus-community.github.io/helm-charts
  3. Fetch the latest chart from the Helm chart repository to install Rancher.

    The command pulls the latest chart and saves it as a .tgz file in the current directory. To fetch the chart for a specific version to upgrade to, add the --version= flag, as follows:

    helm fetch rancher-stable/rancher --version=v2.6.4
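
Before fetching, it can help to confirm that the refreshed repository cache actually lists the target version. The sketch below assumes the table layout printed by `helm search repo --versions` (NAME, CHART VERSION, APP VERSION, DESCRIPTION); the function name is hypothetical.

```shell
#!/bin/sh
# Succeed if the chart repository cache lists the given chart version.
# Assumes the default table output of `helm search repo --versions`,
# where the second column of each data row is the chart version.
chart_has_version() (
  chart="$1"
  version="${2#v}"   # tolerate a leading "v", e.g. v2.6.4
  helm search repo "$chart" --versions |
    awk -v v="$version" 'NR > 1 && $2 == v { found = 1 } END { exit !found }'
)
```

For example, `chart_has_version rancher-stable/rancher v2.6.4 && echo available` prints `available` only when that version is in the local cache.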

Upgrade Rancher

Use Helm to upgrade a normal (internet-connected) Rancher installation.

Get the values that were passed with --set from the currently installed Rancher Helm chart.

$ helm get values rancher -n cattle-system
USER-SUPPLIED VALUES:
hostname: rancher.e-whisper.com
ingress:
  tls:
    source: letsEncrypt
replicas: 1
systemDefaultRegistry: registry.cn-hangzhou.aliyuncs.com

🐾 Note:

Because my cluster is only for testing and demos, replicas is set to 1.

Append all values from the previous step to the command with --set key=value.

helm upgrade rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.e-whisper.com \
  --set ingress.tls.source=letsEncrypt \
  --set replicas=1 \
  --set systemDefaultRegistry=registry.cn-hangzhou.aliyuncs.com \
  --version=2.6.4
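
Retyping every value as a --set flag is easy to get wrong. One alternative, not the method used in this article, is to export the stored values to a file and pass that file with -f. The sketch below assumes nothing beyond standard Helm 3 flags; the function name and argument order are hypothetical.

```shell
#!/bin/sh
# Upgrade a release while reusing its stored user-supplied values.
# Arguments (hypothetical order): release, chart, namespace, chart version.
upgrade_with_saved_values() (
  release="$1"; chart="$2"; namespace="$3"; version="$4"
  values_file="$(mktemp)" || exit 1
  # Remove the temporary file when this subshell exits.
  trap 'rm -f "$values_file"' EXIT
  # -o yaml prints only the values, without the "USER-SUPPLIED VALUES:" header.
  helm get values "$release" --namespace "$namespace" -o yaml > "$values_file" || exit 1
  helm upgrade "$release" "$chart" --namespace "$namespace" \
    --version "$version" -f "$values_file"
)
```

It would be called as `upgrade_with_saved_values rancher rancher-stable/rancher cattle-system 2.6.4`. Because every stored value is carried forward unchanged, it is worth reviewing the output of `helm get values` first.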

Verify that the upgrade succeeded

Log in to Rancher and confirm that the upgrade was successful.

Rancher upgraded to v2.6.4 successfully
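
The same check can be approximated from the command line. This is a sketch that assumes the deployment is named rancher in the cattle-system namespace, matching the helm commands in this article; the function name is hypothetical.

```shell
#!/bin/sh
# Wait for the Rancher rollout to finish, then print the image now in use.
# Assumes deployment "rancher" in namespace "cattle-system".
verify_rancher_rollout() (
  kubectl -n cattle-system rollout status deploy/rancher || exit 1
  kubectl -n cattle-system get deploy rancher \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
)
```

After a successful upgrade, the printed image tag should correspond to the target version.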

🎉🎉🎉

However, a few problems also surfaced during the upgrade and verification. They are described and resolved below.

Issues encountered during and after the upgrade

  • helm upgrade fails with the error: rendered manifests contain a resource that already exists
  • The managed cluster home-k3s cannot connect.

Helm failed to upgrade Rancher

Issue

The error is reported as follows:

Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. 
Unable to continue with update: Secret "bootstrap-secret" in namespace "cattle-system" exists and cannot be imported into the current release: invalid ownership metadata; 
label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; 
annotation validation error: missing key "meta.helm.sh/release-name": must be set to "rancher"; 
annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "cattle-system"

Workaround

Searching GitHub for related issues showed that this is a bug in v2.6.4. The workaround:

First delete the secret, then run the helm upgrade again:

kubectl delete secret -n cattle-system bootstrap-secret

helm upgrade rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.e-whisper.com \
  --set ingress.tls.source=letsEncrypt \
  --set replicas=1 \
  --set systemDefaultRegistry=registry.cn-hangzhou.aliyuncs.com \
  --version=2.6.4

Problem solved.
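
The error message itself names the label and annotations that Helm expects on a resource it owns. A different approach, which was not used in this article and has not been verified against this particular Rancher bug, is to add that ownership metadata so Helm can adopt the existing secret instead of deleting it. The function name is hypothetical.

```shell
#!/bin/sh
# Alternative sketch: mark an existing Secret as owned by a Helm release by
# adding the label and annotations listed in the "invalid ownership metadata"
# error. Untested against the v2.6.4 bug described in this article.
adopt_into_release() (
  secret="$1"; namespace="$2"; release="$3"
  kubectl -n "$namespace" label secret "$secret" \
    app.kubernetes.io/managed-by=Helm --overwrite || exit 1
  kubectl -n "$namespace" annotate secret "$secret" \
    "meta.helm.sh/release-name=$release" \
    "meta.helm.sh/release-namespace=$namespace" --overwrite
)
```

It would be called as `adopt_into_release bootstrap-secret cattle-system rancher`, followed by the same helm upgrade command. Note that Helm then manages the secret's contents according to the chart.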

Managed cluster home-k3s cannot connect

Issue

After the upgrade, the managed cluster home-k3s could not connect, as shown below:

Managed cluster cannot connect

Logging in to the managed cluster and inspecting cattle-cluster-agent showed an error indicating that the image had the wrong architecture: an x86_64 image had been pulled.

This happened because systemDefaultRegistry=registry.cn-hangzhou.aliyuncs.com was set when Rancher was originally installed with Helm. The registry.cn-hangzhou.aliyuncs.com registry only has x86_64 images, not arm64 images, while my home-k3s cluster runs on a Raspberry Pi 4.
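
When diagnosing this kind of mismatch, it helps to translate what `uname -m` reports on a node into the architecture names used for container images. The helper below is a small sketch with a hypothetical function name; it covers only the common cases.

```shell
#!/bin/sh
# Map a `uname -m` machine name to the architecture name used for images.
# Only common cases are covered; anything else is reported as unknown.
image_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv7l)        echo arm ;;
    *)             echo "unknown: $1" >&2; return 1 ;;
  esac
}
```

Running `image_arch "$(uname -m)"` on a node shows which image architecture it needs. `kubectl get nodes -L kubernetes.io/arch` shows the same information for all nodes in a cluster.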

Workaround

Remove the systemDefaultRegistry=registry.cn-hangzhou.aliyuncs.com setting from the Helm values and run the upgrade again, as follows:

helm upgrade rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.e-whisper.com \
  --set ingress.tls.source=letsEncrypt \
  --set replicas=1

After the command succeeded, the Helm values had changed, but Rancher’s systemDefaultRegistry was still registry.cn-hangzhou.aliyuncs.com.

The Rancher UI shows the setting as follows, marked as set by an env value:

systemDefaultRegistry as shown in the Rancher UI

In the end, I found that the value comes from this configuration:

apiVersion: management.cattle.io/v3
kind: Setting
metadata:
  name: system-default-registry
customized: false
default: ''
source: ''
value: 'registry.cn-hangzhou.aliyuncs.com'

Delete this Setting object, or replace its value with value: '', and then restart Rancher. Once the restart takes effect, 'registry.cn-hangzhou.aliyuncs.com' is no longer set.
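
The second option (emptying the value and restarting) could be scripted roughly as follows. This is a sketch that assumes kubectl points at the cluster hosting Rancher and that the deployment is named rancher in cattle-system; the function name is hypothetical.

```shell
#!/bin/sh
# Clear the system-default-registry Setting, then restart Rancher.
# Assumes kubectl targets the cluster that hosts Rancher and that the
# deployment is "rancher" in namespace "cattle-system".
clear_default_registry() (
  kubectl patch settings.management.cattle.io system-default-registry \
    --type merge -p '{"value":""}' || exit 1
  kubectl -n cattle-system rollout restart deploy/rancher
)
```

Afterwards, `kubectl get settings.management.cattle.io system-default-registry -o yaml` can be used to confirm that value is empty.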

Problem solved.

📚️ Reference documentation