Velero Series (5): Production Best Practices for Kubernetes Cluster Backup and Disaster Recovery with Velero

This article was last updated on: February 7, 2024

Velero

| Consideration | CSI snapshot | Restic file-level copy |
| --- | --- | --- |
| App performance impact | Low; the CSI interface invokes the storage system's snapshot function | Depends on the amount of data; consumes additional resources |
| Data availability | Depends on the storage system | Object storage is isolated from the production environment, independently available, and supports cross-site availability |
| Data consistency | Crash-consistent; hooks can be used to achieve consistency | No guarantee; relies on hooks |

Best practices

High-frequency local snapshots plus low-frequency Restic backups to S3
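
As a rough illustration of this two-tier approach, the sketch below builds two Velero `Schedule` manifests as Python dicts: a frequent one that takes volume snapshots and an infrequent one that uses Restic file copies. The schedule names, the `my-app` namespace, the cron intervals, and the TTL values are all hypothetical. The field `defaultVolumesToRestic` matches Velero releases from around the time this article was written; Velero 1.10 and later rename it to `defaultVolumesToFsBackup`, so adjust for your version.

```python
import json


def make_schedule(name, cron, namespaces, use_restic, ttl):
    """Build a Velero Schedule manifest (velero.io/v1) as a plain dict."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": cron,  # standard cron expression
            "template": {
                "includedNamespaces": namespaces,
                # Volume snapshots only for the frequent schedule.
                "snapshotVolumes": not use_restic,
                # Restic file copy only for the infrequent schedule.
                "defaultVolumesToRestic": use_restic,
                "ttl": ttl,  # how long Velero keeps each backup
            },
        },
    }


# Hypothetical names, namespace, and intervals; tune them to your own RPO.
local_snapshots = make_schedule(
    "app-snap-hourly", "0 * * * *", ["my-app"], False, "72h0m0s"
)
restic_to_s3 = make_schedule(
    "app-restic-nightly", "0 2 * * *", ["my-app"], True, "720h0m0s"
)

# Wrap both manifests in a v1 List so they can be reviewed or applied together.
bundle = {"apiVersion": "v1", "kind": "List", "items": [local_snapshots, restic_to_s3]}
print(json.dumps(bundle, indent=2))
```

The snapshot schedule keeps a short retention because it is cheap and frequent, while the Restic schedule keeps a longer one because it is the copy that survives a storage system failure. Both numbers here are placeholders, not recommendations from the original article.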

Select the appropriate backup granularity and backup strategy from the application perspective
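
One way to express application-level granularity is to scope each backup to a namespace and a label selector instead of backing up the whole cluster. The sketch below builds such a `Backup` manifest as a Python dict. The `app` label key, the `orders` application, and the `shop` namespace are hypothetical; use whatever labels your workloads actually carry.

```python
import json


def make_app_backup(name, namespace, app_label, resources=None):
    """Build a Velero Backup manifest limited to one labeled application."""
    spec = {
        "includedNamespaces": [namespace],
        # Only objects carrying this label are included in the backup.
        "labelSelector": {"matchLabels": {"app": app_label}},
    }
    if resources:
        # Optionally narrow the backup further to specific resource types.
        spec["includedResources"] = list(resources)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": spec,
    }


# Hypothetical application: back up only the "orders" app in the "shop" namespace.
orders_backup = make_app_backup("orders-manual", "shop", "orders")
print(json.dumps(orders_backup, indent=2))
```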

Prevent conflicts when sharing the same object store in a multi-cluster environment
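
The article does not spell out how to prevent these conflicts. One common approach, shown here only as an assumption, is to give each cluster its own `prefix` inside the shared bucket and to mark any location that a cluster should only restore from as `ReadOnly`. The sketch builds `BackupStorageLocation` manifests and checks that no two writable locations share the same bucket and prefix. The cluster names, bucket name, and region are hypothetical.

```python
def make_bsl(cluster, bucket, region, read_only=False):
    """Build a BackupStorageLocation that writes under a per-cluster prefix."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "BackupStorageLocation",
        "metadata": {"name": "default", "namespace": "velero"},
        "spec": {
            "provider": "aws",
            # The prefix keeps each cluster's backups in a separate path.
            "objectStorage": {"bucket": bucket, "prefix": cluster},
            "accessMode": "ReadOnly" if read_only else "ReadWrite",
            "config": {"region": region},
        },
    }


def check_no_shared_writers(locations):
    """Raise ValueError if two writable locations target the same bucket/prefix."""
    seen = set()
    for loc in locations:
        spec = loc["spec"]
        if spec["accessMode"] != "ReadWrite":
            continue  # read-only locations cannot overwrite each other's data
        key = (spec["objectStorage"]["bucket"], spec["objectStorage"].get("prefix", ""))
        if key in seen:
            raise ValueError(f"multiple writers for bucket/prefix {key}")
        seen.add(key)


# Hypothetical clusters sharing one bucket, each under its own prefix.
locations = [
    make_bsl("cluster-a", "velero-backups", "us-east-1"),
    make_bsl("cluster-b", "velero-backups", "us-east-1"),
]
check_no_shared_writers(locations)
```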

Pitfalls

Deleting a backup or restore task that has remained incomplete for a long time can block Velero, leaving it unable to process subsequent tasks.
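
To notice such tasks early, before anyone reaches for a delete, a small check like the following might help. It is only a sketch: it assumes the input is the `items` list from something like `kubectl -n velero get backups.velero.io -o json`, that each item exposes `status.phase` and `status.startTimestamp` in the usual `YYYY-MM-DDTHH:MM:SSZ` form, and the four-hour threshold and backup names are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_tasks(items, now, max_age):
    """Return names of tasks that have been InProgress longer than max_age."""
    stuck = []
    for item in items:
        status = item.get("status", {})
        if status.get("phase") != "InProgress":
            continue
        started = status.get("startTimestamp")
        if not started:
            continue  # no start time recorded, so age cannot be judged
        started_at = datetime.strptime(started, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        if now - started_at > max_age:
            stuck.append(item["metadata"]["name"])
    return stuck


# Hypothetical sample data shaped like Velero Backup objects.
sample_items = [
    {"metadata": {"name": "old-backup"},
     "status": {"phase": "InProgress", "startTimestamp": "2022-05-25T00:00:00Z"}},
    {"metadata": {"name": "finished-backup"},
     "status": {"phase": "Completed", "startTimestamp": "2022-05-24T00:00:00Z"}},
    {"metadata": {"name": "fresh-backup"},
     "status": {"phase": "InProgress", "startTimestamp": "2022-05-25T05:30:00Z"}},
]
sample_now = datetime(2022, 5, 25, 6, 0, tzinfo=timezone.utc)
print(find_stuck_tasks(sample_items, sample_now, timedelta(hours=4)))
```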

Q&A

Q: How do Velero snapshots compare with the snapshots provided by enterprise storage, such as NetApp?

A: Unlike enterprise storage snapshots, Velero can perform backups from the application's perspective.

In addition, if you back up to S3, you can use hooks to achieve consistency.
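
As a hedged illustration of the hook mechanism, Velero reads pre- and post-backup hook commands from pod annotations, with each command written as a JSON array string. The sketch below builds those annotations for a filesystem freeze and unfreeze around the backup. The container name `fsfreeze` and the mount path `/var/lib/data` are hypothetical, and the pod needs a container that actually ships the `fsfreeze` binary with sufficient privileges.

```python
import json


def backup_hook_annotations(container, pre_cmd, post_cmd):
    """Build Velero backup hook annotations for a pod template."""
    return {
        "pre.hook.backup.velero.io/container": container,
        # Velero expects the command as a JSON array encoded in a string.
        "pre.hook.backup.velero.io/command": json.dumps(pre_cmd),
        "post.hook.backup.velero.io/container": container,
        "post.hook.backup.velero.io/command": json.dumps(post_cmd),
    }


# Hypothetical container and path: freeze writes before backup, unfreeze after.
annotations = backup_hook_annotations(
    "fsfreeze",
    ["/sbin/fsfreeze", "--freeze", "/var/lib/data"],
    ["/sbin/fsfreeze", "--unfreeze", "/var/lib/data"],
)
print(json.dumps({"metadata": {"annotations": annotations}}, indent=2))
```

For databases, the freeze command is often replaced with an application-specific flush or lock command; which one is appropriate depends on the workload.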

A recommended best practice: take a snapshot first, then copy the data to S3 in the background during the early morning hours.


https://e-whisper.com/posts/22436/
Author
east4ming
Posted on
May 25, 2022
Licensed under