Velero Series (4): Production Migration with Velero
This article was last updated on July 24, 2024.
Overview
Objective
Using the velero tool, achieve the following overall goal:

- Migrate specific namespaces from cluster B to cluster A.

The specific objectives are:

- Install velero (including restic) on both cluster B and cluster A.
- Back up the specific namespace `caseycui2020` on cluster B:
  - Back up resources such as deployments, configmaps, etc.
    - Before backing up, exclude specific `secrets` YAML.
  - Back up volume data (via restic).
    - Use the "opt-in" approach so that only specific pod volumes are backed up.
- Migrate the specific namespace `caseycui2020` to cluster A:
  - Migrate resources: use `include` to migrate only specific resources.
  - Migrate volume data (via restic).
Installation
- Create a Velero-specific credentials file (`credentials-velero`) in your local directory. Object storage uses XSKY (the company's NetApp object storage is not compatible):

  ```ini
  [default]
  aws_access_key_id = xxxxxxxxxxxxxxxxxxxxxxxx
  aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  ```
- (OpenShift) You need to create the `velero` project first:

  ```bash
  oc new-project velero
  ```
- By default, user-created OpenShift namespaces do not schedule pods on all nodes in the cluster.

  To schedule its pods on all nodes, the namespace needs an annotation:

  ```bash
  oc annotate namespace velero openshift.io/node-selector=""
  ```

  This should be done before installing Velero.
- Start the server and storage services. In the Velero directory, run:

  ```bash
  velero install \
      --provider aws \
      --plugins velero/velero-plugin-for-aws:v1.0.0 \
      --bucket velero \
      --secret-file ./credentials-velero \
      --use-restic \
      --use-volume-snapshots=true \
      --backup-location-config region="default",s3ForcePathStyle="true",s3Url="http://glacier.e-whisper.com",insecureSkipTLSVerify="true",signatureVersion="4" \
      --snapshot-location-config region="default"
  ```

  The resources created include:

  ```
  CustomResourceDefinition/backups.velero.io: attempting to create resource
  CustomResourceDefinition/backups.velero.io: created
  CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
  CustomResourceDefinition/backupstoragelocations.velero.io: created
  CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
  CustomResourceDefinition/deletebackuprequests.velero.io: created
  CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
  CustomResourceDefinition/downloadrequests.velero.io: created
  CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
  CustomResourceDefinition/podvolumebackups.velero.io: created
  CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource
  CustomResourceDefinition/podvolumerestores.velero.io: created
  CustomResourceDefinition/resticrepositories.velero.io: attempting to create resource
  CustomResourceDefinition/resticrepositories.velero.io: created
  CustomResourceDefinition/restores.velero.io: attempting to create resource
  CustomResourceDefinition/restores.velero.io: created
  CustomResourceDefinition/schedules.velero.io: attempting to create resource
  CustomResourceDefinition/schedules.velero.io: created
  CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource
  CustomResourceDefinition/serverstatusrequests.velero.io: created
  CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource
  CustomResourceDefinition/volumesnapshotlocations.velero.io: created
  Waiting for resources to be ready in cluster...
  Namespace/velero: attempting to create resource
  Namespace/velero: created
  ClusterRoleBinding/velero: attempting to create resource
  ClusterRoleBinding/velero: created
  ServiceAccount/velero: attempting to create resource
  ServiceAccount/velero: created
  Secret/cloud-credentials: attempting to create resource
  Secret/cloud-credentials: created
  BackupStorageLocation/default: attempting to create resource
  BackupStorageLocation/default: created
  VolumeSnapshotLocation/default: attempting to create resource
  VolumeSnapshotLocation/default: created
  Deployment/velero: attempting to create resource
  Deployment/velero: created
  DaemonSet/restic: attempting to create resource
  DaemonSet/restic: created
  Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.
  ```
- (OpenShift) Add the `velero` ServiceAccount to the `privileged` SCC:

  ```bash
  oc adm policy add-scc-to-user privileged -z velero -n velero
  ```
- (OpenShift) For OpenShift version >= 4.1, modify the restic DaemonSet YAML to request `privileged` mode:

  ```diff
  @@ -67,3 +67,5 @@ spec:
               value: /credentials/cloud
             - name: VELERO_SCRATCH_DIR
               value: /scratch
  +          securityContext:
  +            privileged: true
  ```

  Or:

  ```bash
  oc patch ds/restic \
    --namespace velero \
    --type json \
    -p '[{"op":"add","path":"/spec/template/spec/containers/0/securityContext","value": { "privileged": true}}]'
  ```
Backup - Cluster B
Back up specific resources at the cluster level
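A minimal sketch of a cluster-level backup limited to specific resource types (the backup name and resource list are illustrative, not from the original environment):

```bash
# Back up only selected resource types, including cluster-scoped ones
velero backup create cluster-resources-backup \
  --include-cluster-resources=true \
  --include-resources storageclasses,clusterroles,clusterrolebindings
```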
View the backup
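For example (standard Velero commands; the backup name is illustrative):

```bash
# List backups and check their status
velero backup get

# Show the details of a single backup
velero backup describe cluster-resources-backup --details
```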
Back up the specific namespace caseycui2020
Exclude specific resources
Resources labeled `velero.io/exclude-from-backup=true` are not included in the backup, even if they contain matching selector labels. In this way, `secret`s and other resources that do not need to be backed up can be excluded with the `velero.io/exclude-from-backup=true` label.

Some of the `secret`s excluded this way are listed below (a sketch of the labeling command follows the list):
- `builder-dockercfg-jbnzr`
- `default-token-lshh8`
- `pipeline-token-xt645`
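For example, a sketch of how such a label could be applied (secret name taken from the list above):

```bash
# Mark a secret so Velero skips it during backup
oc -n caseycui2020 label secret builder-dockercfg-jbnzr velero.io/exclude-from-backup=true
```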
Back up Pod Volumes using restic
🐾 Note:

Under this namespace, the following 2 pod volumes also need to be backed up, but they are not yet officially in use:

- `mycoreapphttptask-callback`
- `mycoreapphttptaskservice-callback`

Use the "opt-in" approach to back up volumes selectively.
- Run the following command for each pod that contains a volume you want to back up:

  ```bash
  oc -n caseycui2020 annotate pod/<mybackendapp-pod-name> backup.velero.io/backup-volumes=jmx-exporter-agent,pinpoint-agent,my-mybackendapp-claim
  oc -n caseycui2020 annotate pod/<elitegetrecservice-pod-name> backup.velero.io/backup-volumes=uploadfile
  ```

  where the volume names are the names of the volumes in the pod spec.
For example, for the following pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample
  namespace: foo
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-webserver
    volumeMounts:
    - name: pvc-volume
      mountPath: /volume-1
    - name: emptydir-volume
      mountPath: /volume-2
  volumes:
  - name: pvc-volume
    persistentVolumeClaim:
      claimName: test-volume-claim
  - name: emptydir-volume
    emptyDir: {}
```

You should run:
```bash
kubectl -n foo annotate pod/sample backup.velero.io/backup-volumes=pvc-volume,emptydir-volume
```
If you use a controller to manage your pods, you can also provide this annotation in the pod template spec.
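For example, a sketch of the same opt-in annotation set in a Deployment's pod template (names are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
  namespace: foo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
      annotations:
        # restic opt-in: back up these pod volumes
        backup.velero.io/backup-volumes: pvc-volume,emptydir-volume
    spec:
      containers:
        - name: test-webserver
          image: k8s.gcr.io/test-webserver
```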
Backup and verification
Back up the namespace and its objects, and the pod volume with the associated annotation:
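A sketch of the backup command, assuming the backup is named `caseycui2020` (the name referenced by the restore step later):

```bash
# Back up only the caseycui2020 namespace; resources labelled
# velero.io/exclude-from-backup=true are skipped, and only pod volumes
# annotated with backup.velero.io/backup-volumes are backed up by restic
velero backup create caseycui2020 --include-namespaces caseycui2020
```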
View the backup and describe it to check its details.
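For example (assuming the backup is named `caseycui2020`, as in the restore step later):

```bash
# List backups and their status
velero backup get

# Describe the backup, including pod volume backups and any warnings/errors
velero backup describe caseycui2020 --details
```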
Back up regularly
To create a regularly scheduled backup using a cron expression:
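A sketch, assuming a daily backup of the `caseycui2020` namespace at 03:00 (the schedule name and cron expression are illustrative):

```bash
# Daily backup at 03:00 using a standard cron expression
velero schedule create caseycui2020-daily \
  --schedule="0 3 * * *" \
  --include-namespaces caseycui2020
```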
Alternatively, you can use some non-standard shorthand cron expressions:
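For example, the same schedule expressed with the cron package's `@every` shorthand (illustrative):

```bash
# Back up every 24 hours using the shorthand syntax
velero schedule create caseycui2020-daily \
  --schedule="@every 24h" \
  --include-namespaces caseycui2020
```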
For more usage examples, see the cron package documentation.
Cluster Migration - To Cluster A
Using backups and restores
As long as you point each Velero instance at the same cloud object storage location, Velero can help you migrate resources from one cluster to another. This scenario assumes that your clusters are hosted by the same cloud provider. Note that Velero does not natively support migrating persistent volume snapshots across cloud providers. If you want to migrate volume data between cloud platforms, enable restic, which backs up the volume contents at the file-system level.
- (Cluster B) Assuming you are not already using the Velero `schedule` operation to checkpoint your data, you first need to back up the entire cluster (replace `<BACKUP-NAME>` as needed):

  ```bash
  velero backup create <BACKUP-NAME>
  ```

  The default backup retention period, expressed as a TTL, is 30 days (720 hours); you can change it as needed with the `--ttl <DURATION>` flag. For more information about backup expiration, see "How Velero works".
- (Cluster A) Configure `BackupStorageLocations` and `VolumeSnapshotLocations` to point to the location cluster B is using, with `velero backup-location create` and `velero snapshot-location create`. To make the `BackupStorageLocation` read-only, use the `--access-mode=ReadOnly` flag of `velero backup-location create` (since I only have one bucket, I did not configure it as read-only). Here, Velero is installed on cluster A with `BackupStorageLocations` and `VolumeSnapshotLocations` configured during installation:

  ```bash
  velero install \
      --provider aws \
      --plugins velero/velero-plugin-for-aws:v1.0.0 \
      --bucket velero \
      --secret-file ./credentials-velero \
      --use-restic \
      --use-volume-snapshots=true \
      --backup-location-config region="default",s3ForcePathStyle="true",s3Url="http://glacier.e-whisper.com",insecureSkipTLSVerify="true",signatureVersion="4" \
      --snapshot-location-config region="default"
  ```
- (Cluster A) Ensure that the Velero Backup object has been created. Velero resources are synced with the backup files in cloud storage:

  ```bash
  velero backup describe <BACKUP-NAME>
  ```

  Note: the default sync interval is 1 minute, so make sure to wait before checking. You can configure this interval with the Velero server's `--backup-sync-period` flag.
- (Cluster A) Once it is confirmed that the correct backup (`<BACKUP-NAME>`) exists, you can restore everything. (Because the backup contains only the `caseycui2020` namespace, the restore does not need `--include-namespaces caseycui2020` for filtering.)

  ```bash
  velero restore create --from-backup caseycui2020 --include-resources buildconfigs.build.openshift.io,configmaps,deploymentconfigs.apps.openshift.io,imagestreams.image.openshift.io,imagestreamtags.image.openshift.io,imagetags.image.openshift.io,limitranges,namespaces,networkpolicies.networking.k8s.io,persistentvolumeclaims,prometheusrules.monitoring.coreos.com,resourcequotas,rolebindings.authorization.openshift.io,rolebindings.rbac.authorization.k8s.io,routes.route.openshift.io,secrets,servicemonitors.monitoring.coreos.com,services,templateinstances.template.openshift.io
  ```

  Because the `persistentvolumeclaims` restore later turned out to be problematic, remove that resource from the list when actually using this, and work out the PVC issue separately:

  ```bash
  velero restore create --from-backup caseycui2020 --include-resources buildconfigs.build.openshift.io,configmaps,deploymentconfigs.apps.openshift.io,imagestreams.image.openshift.io,imagestreamtags.image.openshift.io,imagetags.image.openshift.io,limitranges,namespaces,networkpolicies.networking.k8s.io,prometheusrules.monitoring.coreos.com,resourcequotas,rolebindings.authorization.openshift.io,rolebindings.rbac.authorization.k8s.io,routes.route.openshift.io,secrets,servicemonitors.monitoring.coreos.com,services,templateinstances.template.openshift.io
  ```
Verify the two clusters
Check that the second cluster is working as expected:
- (Cluster A) Run:

  ```bash
  velero restore get
  ```

  The results are as follows:

  ```
  NAME                          BACKUP         STATUS            STARTED   COMPLETED   ERRORS   WARNINGS   CREATED                         SELECTOR
  caseycui2020-20201021102342   caseycui2020   Failed            <nil>     <nil>       0        0          2020-10-21 10:24:14 +0800 CST   <none>
  caseycui2020-20201021103040   caseycui2020   PartiallyFailed   <nil>     <nil>       46       34         2020-10-21 10:31:12 +0800 CST   <none>
  caseycui2020-20201021105848   caseycui2020   InProgress        <nil>     <nil>       0        0          2020-10-21 10:59:20 +0800 CST   <none>
  ```
- Then run:

  ```bash
  velero restore describe <RESTORE-NAME-FROM-GET-COMMAND>
  oc -n velero get podvolumerestores -l velero.io/restore-name=YOUR_RESTORE_NAME -o yaml
  ```

  The results are as follows:

  ```
  Name:         caseycui2020-20201021102342
  Namespace:    velero
  Labels:       <none>
  Annotations:  <none>

  Phase:  InProgress

  Started:    <n/a>
  Completed:  <n/a>

  Backup:  caseycui2020

  Namespaces:
    Included:  all namespaces found in the backup
    Excluded:  <none>

  Resources:
    Included:        buildconfigs.build.openshift.io, configmaps, deploymentconfigs.apps.openshift.io, imagestreams.image.openshift.io, imagestreamtags.image.openshift.io, imagetags.image.openshift.io, limitranges, namespaces, networkpolicies.networking.k8s.io, persistentvolumeclaims, prometheusrules.monitoring.coreos.com, resourcequotas, rolebindings.authorization.openshift.io, rolebindings.rbac.authorization.k8s.io, routes.route.openshift.io, secrets, servicemonitors.monitoring.coreos.com, services, templateinstances.template.openshift.io
    Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
    Cluster-scoped:  auto

  Namespace mappings:  <none>

  Label selector:  <none>

  Restore PVs:  auto
  ```
If you run into problems, make sure Velero is running in the same namespace in both clusters.
I ran into a problem here: with the OpenShift ImageStreams and ImageTags, the corresponding images could not be pulled, so the containers did not start. And because the containers did not start, the pod volumes were not restored successfully.
```
Name:         caseycui2020-20201021110424
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  PartiallyFailed (run 'velero restore logs caseycui2020-20201021110424' for more information)

Started:    <n/a>
Completed:  <n/a>

Warnings:
  Velero:     <none>
  Cluster:    <none>
  Namespaces:
    caseycui2020:  could not restore, imagetags.image.openshift.io "mybackendapp:1.0.0" already exists. Warning: the in-cluster version is different than the backed-up version.
                   could not restore, imagetags.image.openshift.io "mybackendappno:1.0.0" already exists. Warning: the in-cluster version is different than the backed-up version.
                   ...

Errors:
  Velero:     <none>
  Cluster:    <none>
  Namespaces:
    caseycui2020:  error restoring imagestreams.image.openshift.io/caseycui2020/mybackendapp: ImageStream.image.openshift.io "mybackendapp" is invalid: []: Internal error: imagestreams "mybackendapp" is invalid: spec.tags[latest].from.name: Invalid value: "mybackendapp@sha256:6c5ab553a97c74ad602d2427a326124621c163676df91f7040b035fa64b533c7": error generating tag event: imagestreamimage.image.openshift.io ......

Backup:  caseycui2020

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        buildconfigs.build.openshift.io, configmaps, deploymentconfigs.apps.openshift.io, imagestreams.image.openshift.io, imagestreamtags.image.openshift.io, imagetags.image.openshift.io, limitranges, namespaces, networkpolicies.networking.k8s.io, persistentvolumeclaims, prometheusrules.monitoring.coreos.com, resourcequotas, rolebindings.authorization.openshift.io, rolebindings.rbac.authorization.k8s.io, routes.route.openshift.io, secrets, servicemonitors.monitoring.coreos.com, services, templateinstances.template.openshift.io
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto
```
Summary of migration issues
The issues found so far are summarized as follows:
- The images in `imagestreams.image.openshift.io`, `imagestreamtags.image.openshift.io`, and `imagetags.image.openshift.io` were not imported successfully; specifically, the `latest` tag was not imported successfully. The `imagestreamtags.image.openshift.io` also take some time to take effect.

- `persistentvolumeclaims` report an error after migration, as follows:

  ```
  phase: Lost
  ```

  The reason: the StorageClass configuration of clusters A and B differs, so a PVC from cluster B cannot bind directly in cluster A. Moreover, a PVC cannot be modified after creation; it has to be deleted and recreated.

- `Routes` domain names: some domain names are specific to a particular cluster, e.g. `jenkins-caseycui2020.b.caas.e-whisper.com` on cluster B needs to become `jenkins-caseycui2020.a.caas.e-whisper.com` after migrating to cluster A.

- `podVolume` data was not migrated.
The `latest` tag was not imported successfully
To import it manually, the command is as follows (`1.0.1` is the latest version of the ImageStream):
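A sketch of one way to do this, assuming the goal is to re-point `latest` at the already-migrated `1.0.1` tag (the ImageStream name is taken from the examples above):

```bash
# Re-create the "latest" tag from the 1.0.1 tag of the same ImageStream
oc -n caseycui2020 tag mybackendapp:1.0.1 mybackendapp:latest
```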
PVC `phase: Lost` problem
If creating it manually, the PVC YAML needs to be adjusted. The PVC before and after adjustment is as follows:
Cluster B original YAML:
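A sketch of what the cluster B PVC might have looked like (name, size, and StorageClass are illustrative; the bind annotations and `volumeName` tie it to a PV that exists only in cluster B):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-mybackendapp-claim
  namespace: caseycui2020
  annotations:
    pv.kubernetes.io/bind-completed: "yes"        # bind state carried over from cluster B
    pv.kubernetes.io/bound-by-controller: "yes"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: cluster-b-storageclass        # cluster B StorageClass (illustrative)
  volumeName: pvc-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  # PV that only exists in cluster B
```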
After adjustment:
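A sketch of the adjusted PVC for cluster A: drop the bind annotations and `volumeName`, and use cluster A's StorageClass so the PVC can bind to a new PV (StorageClass name is illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-mybackendapp-claim
  namespace: caseycui2020
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: cluster-a-storageclass        # cluster A StorageClass (illustrative)
```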
podVolume data was not migrated
You can migrate manually, with the following command:
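A sketch of a manual copy using `oc rsync`, assuming a `/data` mount path in both pods (contexts, pod names, and paths are illustrative):

```bash
# Pull the volume contents from the cluster B pod to a local directory
oc --context=cluster-b -n caseycui2020 rsync <mybackendapp-pod-b>:/data ./data-backup

# Push the local copy into the corresponding pod on cluster A
oc --context=cluster-a -n caseycui2020 rsync ./data-backup/ <mybackendapp-pod-a>:/data
```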
Summary
This article was written some time ago. OpenShift has since released a proprietary migration tool built on top of Velero, and migrations can now be done directly through the tooling it provides.

In addition, OpenShift clusters have many restrictions and many OpenShift-exclusive resources, so actual usage differs considerably from standard Kubernetes, and you need to pay careful attention.

Although this attempt failed, the ideas are still useful for reference.