Recovery
This guide explains how to restore your system to a previous state using backup data.
Recovery is typically needed in the following situations:
- System failure: Data was lost due to server failure, disk corruption, or a similar fault.
- Accidental deletion: Important resources or data were mistakenly deleted.
- Update rollback: A problematic update needs to be rolled back to a previous state.
- Environment replication: Backups are used to set up an identical environment elsewhere.
Recovery Types
KIWI provides two recovery methods depending on your environment.
- etcd Recovery: For Kubernetes, restores full cluster state. Time required: minutes to tens of minutes.
- Docker Recovery: For Docker, restores containers, volumes, and images. Time required: varies by item.
Before running either type of recovery, note the following:
- Recovery overwrites current data. Create a backup of the current state first if you may need it.
- If possible, perform recovery during low-traffic hours.
- For production environments, validate recovery in a test environment first.
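For the first caution above, a manual etcd snapshot taken just before the restore could look like the sketch below. It is not a KIWI feature: it assumes `etcdctl` (v3 API) is installed on a control-plane node, and the endpoint and certificate paths are kubeadm-style defaults that may not match your cluster. The function name `pre_restore_snapshot` and the `ETCD_*` variables are hypothetical.

```shell
# Sketch: save the current etcd state before a restore overwrites it.
# Assumptions (not from the KIWI docs): etcdctl v3 is on PATH, and the
# default endpoint/certificate paths below fit a kubeadm-style node.
pre_restore_snapshot() {
  local dir="${1:-.}"
  local file
  file="${dir}/pre-restore-$(date +%Y%m%d-%H%M%S).db"
  # etcdctl output goes to stderr so stdout carries only the file path.
  etcdctl snapshot save "$file" \
    --endpoints="${ETCD_ENDPOINT:-https://127.0.0.1:2379}" \
    --cacert="${ETCD_CACERT:-/etc/kubernetes/pki/etcd/ca.crt}" \
    --cert="${ETCD_CERT:-/etc/kubernetes/pki/etcd/server.crt}" \
    --key="${ETCD_KEY:-/etc/kubernetes/pki/etcd/server.key}" >&2 || return 1
  echo "$file"
}

# Usage (hypothetical directory):
#   snapshot_path="$(pre_restore_snapshot /var/backups/etcd)"
```

If KIWI's own backup function is available, creating the backup there keeps it visible in the backup list; the manual snapshot is only a fallback.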
etcd Recovery
Restore your Kubernetes cluster from an etcd snapshot.
Step 1: Select Backup
- Click [Backup Management] in the left menu
- Find the etcd backup you want to restore in the backup list
- Click to select it
If you have multiple backups, select the one from just before the problem occurred. Selecting a backup that's too old will result in losing all changes made after that point.
Step 2: Verify the Restore Target
Before restoring, verify the following information.
- Backup date/time: The cluster will be reverted to this point in time
- Cluster: Confirm this is the correct target cluster.
- Status: Verify the backup file shows "Normal" status.
Corrupted backups cannot be used for restoration. Select another valid backup or re-verify the backup file integrity.
Step 3: Execute Restore
- Click the Restore button.
- Click Start Restore in the confirmation dialog
- Restore progress will be displayed on screen
During etcd restoration, the cluster is temporarily unavailable: no cluster operations, including Pod scheduling and API calls, can be performed until the restore finishes. Restore time ranges from minutes to tens of minutes depending on data size.
Step 4: Verify Cluster Status
After restoration completes, verify that the cluster is operating normally.
# Check node status - all nodes should be Ready
kubectl get nodes
# Check Pod status in all namespaces
kubectl get pods --all-namespaces
# Check system Pod status
kubectl get pods -n kube-system
- Are all nodes in the Ready state?
- Are all kube-system Pods in the Running state?
- Are application Pods running normally?
- Can you access services normally?
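The first two checklist questions can be answered mechanically by filtering the `kubectl` output. The sketch below is one possible approach, not part of KIWI; the function names are hypothetical, and it assumes the default column layout of `kubectl get ... --no-headers` (node status in column 2, Pod status in column 3).

```shell
# Print nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
not_ready_nodes() {
  awk 'NF && $2 != "Ready" { print $1 }'
}

# Print Pods whose STATUS column is not "Running".
# Reads `kubectl get pods -n kube-system --no-headers` output on stdin.
not_running_pods() {
  awk 'NF && $3 != "Running" { print $1 }'
}

# Run both checks against the live cluster; non-zero exit if anything is off.
check_cluster() {
  local bad_nodes bad_pods
  bad_nodes="$(kubectl get nodes --no-headers | not_ready_nodes)"
  bad_pods="$(kubectl get pods -n kube-system --no-headers | not_running_pods)"
  [ -z "$bad_nodes" ] || echo "Nodes not Ready: $bad_nodes" >&2
  [ -z "$bad_pods" ] || echo "kube-system Pods not Running: $bad_pods" >&2
  [ -z "$bad_nodes" ] && [ -z "$bad_pods" ]
}
```

A cordoned node reports `Ready,SchedulingDisabled` and is flagged by this strict comparison, which is usually what you want right after a restore. Application Pods and service access still need to be checked separately.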
Docker Recovery
Restore Docker volumes, images, and containers from backups.
Volume Recovery
Restore volumes containing application data.
Step 1: Select Backup
- Click the Docker backup tab on the [Backup Management] page
- Select the backup containing the volume you want to restore
Step 2: Select Items to Restore
- Volume: Select the volume to restore
- Path: Specify the restore location.
  - Original location: Overwrite the existing volume.
  - New location: Restore with a different name (preserves existing data).
When restoring volumes in production, it's safer to first restore to a new location to verify the data, then copy to the original location after validation.
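Once the data in the new location has been validated, the copy back to the original volume can be done with a throwaway container. This is a minimal sketch rather than a KIWI operation: the function name `copy_volume` and the volume names are hypothetical, and it assumes the `alpine` image can be pulled and that containers using the target volume have been stopped.

```shell
# copy_volume SRC DST: copy the full contents of Docker volume SRC into DST.
# SRC is mounted read-only so the validated data cannot be modified.
copy_volume() {
  local src="$1" dst="$2"
  docker run --rm \
    -v "${src}:/from:ro" \
    -v "${dst}:/to" \
    alpine sh -c 'cp -a /from/. /to/'
}

# Usage (hypothetical volume names):
#   copy_volume app_data_restored app_data
```

`cp -a` preserves ownership and permissions, but it does not delete files that exist only in the target, so clear the target first if you need an exact copy.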
Step 3: Execute Restore
- Click the Restore button.
- Monitor the restore progress.
Image Recovery
Restore backed-up Docker images.
Step 1: Select Image Backup
- Select the image backup on the [Backup Management] page
- Select the image tar file to restore
Step 2: Load the Image
KIWI internally executes the following command to restore the image:
# KIWI internal operation
docker load -i image_backup.tar
Restored images retain their original tags from the backup point. If an image with the same tag already exists, it will be overwritten.
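If you want to keep the current image reachable after its tag is overwritten, one option is to give it a second tag before restoring. The sketch below is an assumption-labeled illustration, not something KIWI does: `preserve_tag` and the `pre-restore` suffix are hypothetical, and the image reference is expected in `name:tag` form.

```shell
# preserve_tag IMAGE [SUFFIX]: if IMAGE exists locally, add a second tag
# (IMAGE-SUFFIX) so the current image stays addressable after the restore.
preserve_tag() {
  local image="$1" suffix="${2:-pre-restore}"
  if docker image inspect "$image" >/dev/null 2>&1; then
    docker tag "$image" "${image}-${suffix}"
  fi
}

# Usage (hypothetical image name):
#   preserve_tag myapp:1.0      # adds the tag myapp:1.0-pre-restore
```

The extra tag can be removed with `docker rmi` once the restored image has been verified.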
Recovery Verification
After recovery, always verify that the system is operating correctly.
Verification Checklist
- Service status: Check Pod/container status. Expected state: Running.
- Data integrity: Verify application data. Expected state: Data exists from backup point.
- Network: Test service access. Expected state: Normal response.
- Logs: Check application logs. Expected state: No errors.
For critical systems, prepare scripts that automatically perform health checks after recovery to reduce verification time.
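A post-recovery health check often comes down to retrying a probe until the service answers. The helper below is a minimal sketch of that idea; the function name `retry` and the health URL in the usage comment are hypothetical and need to be replaced with whatever your application exposes.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is
# exhausted, sleeping DELAY seconds between tries.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (hypothetical endpoint): up to 10 tries, 6 seconds apart.
#   retry 10 6 curl -fsS -o /dev/null "http://localhost:8080/healthz"
```

Because the probe is passed as an ordinary command, the same helper can wrap a `kubectl` or `docker` status check instead of `curl`.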
Troubleshooting
Restore Failure: "snapshot file corrupted"
restore failed: snapshot file corrupted
Why does this happen?
The backup file is corrupted and cannot be used for restoration. A disk error may have occurred during backup creation, or the file was corrupted after storage.
Resolution
- Use a different backup file: Select a valid backup from a different date.
- Verify backup file integrity: Check the checksum in the backup details.
- Check backup copies: If backups were replicated to another location, use that file.
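If you have direct access to the backup file, the checksum comparison can also be done on the command line. This sketch assumes the checksum shown in the backup details is a SHA-256 hex digest, which the guide does not state, and that GNU `sha256sum` is available; `verify_checksum` is a hypothetical helper name.

```shell
# verify_checksum FILE EXPECTED: succeed only if FILE's SHA-256 digest
# equals EXPECTED (hex string). Returns 2 if FILE cannot be read.
# Assumption: the checksum in the backup details is SHA-256.
verify_checksum() {
  local file="$1" expected="$2" actual
  [ -r "$file" ] || return 2
  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  [ "$actual" = "$expected" ]
}

# Usage (file name and digest are placeholders):
#   verify_checksum etcd-backup.db "<checksum from backup details>" \
#     && echo "backup intact" || echo "backup corrupted or unreadable"
```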
Data Mismatch
This occurs when restored data differs from expectations, or applications produce errors.
Why does this happen?
- The data schema changed between the backup point and the current time.
- Data linked to external systems that were not included in the backup is out of sync with the restored data.
How to check and resolve
- Verify backup point: Confirm the restored data matches the backup point state.
- Check schema compatibility: Verify the application version is compatible with the data schema.
- Sync external systems: Check if databases or other external systems also need restoration.
Restoring only part of a system can cause data inconsistencies. When possible, restore all related components together.