Engineering Note
etcd backup and restore on kubeadm clusters
Working notes on etcd snapshot and restore for kubeadm Kubernetes clusters: exact commands, the data-dir and cluster-token gotchas, and verification steps.
TL;DR
etcd is the only stateful component in a Kubernetes control plane, so an etcd snapshot is a backup of the entire cluster state. This note records the snapshot and restore commands that actually work on kubeadm clusters, the two mistakes that wreck most first-time restores — restoring into the live data directory and reusing the old cluster token — and how to prove the cluster came back consistent instead of assuming it.
Taking the snapshot
On a kubeadm cluster, etcd runs as a static pod and its certificates live under
/etc/kubernetes/pki/etcd/. Snapshot from the control-plane node itself,
against a single endpoint (never a list — snapshot save is a per-member
operation):
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd/snap-$(date +%F-%H%M).db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
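If etcdctl isn’t installed on the host, exec into the etcd pod
(kubectl -n kube-system exec -it etcd-<node> -- sh). The binary is there and
the cert paths are identical. Two caveats: some etcd images ship without a
shell, in which case pass the etcdctl command to kubectl exec directly instead
of sh; and a snapshot written inside the pod only reaches the host if the
target path sits on a hostPath mount, which in the kubeadm manifest means
under /var/lib/etcd.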
Verify the file immediately (with etcd 3.5+, offline snapshot inspection moved
to etcdutl):
etcdutl snapshot status /var/backups/etcd/snap-2025-11-08-0300.db -w table
A sane hash, revision, and total-key count mean the file is worth shipping off the node. And ship it: a snapshot sitting on the same disk as etcd is not a backup, it’s a copy waiting to die with the original. That is the same rule I lean on in backup and recovery security.
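Timestamped snapshots pile up on the node unless something prunes them. A minimal sketch of local retention, assuming the snap-YYYY-MM-DD-HHMM.db naming from the save command above (so name order equals time order); the function name prune_snapshots and the keep count are hypothetical helpers, not part of etcd:

```shell
# Keep only the newest N snapshots in a directory.
# Assumes files are named snap-YYYY-MM-DD-HHMM.db, so sorting by name
# sorts by time. prune_snapshots is a hypothetical helper name.
prune_snapshots() {
  dir=$1
  keep=$2
  # Count matching snapshot files; an empty directory yields 0.
  total=$(find "$dir" -maxdepth 1 -type f -name 'snap-*.db' | wc -l)
  excess=$((total - keep))
  [ "$excess" -gt 0 ] || return 0
  # Oldest names sort first; delete exactly the excess.
  find "$dir" -maxdepth 1 -type f -name 'snap-*.db' | sort | head -n "$excess" |
    while IFS= read -r f; do
      rm -f -- "$f"
    done
}
```

Called as prune_snapshots /var/backups/etcd 48 after each hourly save, this keeps roughly two days of local copies. It only manages the on-node directory; the off-node destination needs its own retention.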
Restoring — where people get burned
A restore doesn’t repair the existing member — it builds a new one from the snapshot. Miss that distinction and you land on one of the two gotchas that eat most first attempts:
Gotcha 1: the data directory. Never restore into the live
/var/lib/etcd. Restore into a fresh path, then repoint the manifest:
etcdutl snapshot restore /var/backups/etcd/snap-2025-11-08-0300.db \
--data-dir /var/lib/etcd-from-backup
Gotcha 2: the cluster token. On a multi-member restore, every member must
be restored from the same snapshot file and with the same new
--initial-cluster-token. The new token gives the restored cluster a fresh
identity, which is what prevents a restored member from accidentally
rejoining ghosts of the old cluster:
etcdutl snapshot restore snap.db \
--data-dir /var/lib/etcd-from-backup \
--name cp1 \
--initial-cluster cp1=https://10.0.0.11:2380,cp2=https://10.0.0.12:2380,cp3=https://10.0.0.13:2380 \
--initial-cluster-token etcd-restore-2025-11-08 \
--initial-advertise-peer-urls https://10.0.0.11:2380
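Because only --name and --initial-advertise-peer-urls differ between members, it is easy to mistype one of the three commands. A sketch that prints the per-member commands from a list of name=ip pairs, so they can be reviewed and pasted on each node; print_restore_commands is a hypothetical helper, and the peer port 2380 and target data-dir are carried over from the example above:

```shell
# Print one `etcdutl snapshot restore` command per member (prints only,
# runs nothing). Arguments: snapshot path, new cluster token, name=ip pairs.
# print_restore_commands is a hypothetical helper name.
print_restore_commands() {
  snap=$1
  token=$2
  shift 2
  # Build the shared --initial-cluster value: name=https://ip:2380,...
  cluster=""
  for pair in "$@"; do
    name=${pair%%=*}
    ip=${pair#*=}
    cluster="${cluster:+$cluster,}${name}=https://${ip}:2380"
  done
  # Emit the per-member command; only --name and the advertise URL differ.
  for pair in "$@"; do
    name=${pair%%=*}
    ip=${pair#*=}
    printf 'etcdutl snapshot restore %s --data-dir /var/lib/etcd-from-backup --name %s --initial-cluster %s --initial-cluster-token %s --initial-advertise-peer-urls https://%s:2380\n' \
      "$snap" "$name" "$cluster" "$token" "$ip"
  done
}
```

For example, print_restore_commands snap.db etcd-restore-2025-11-08 cp1=10.0.0.11 cp2=10.0.0.12 cp3=10.0.0.13 prints three lines, one to run on each control-plane node, all sharing the same token and member list.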
The full sequence on a kubeadm control plane:
- Stop the API server so nothing writes mid-restore: mv /etc/kubernetes/manifests/kube-apiserver.yaml /root/ (kubelet kills the pod within seconds).
- Run the restore into the new data-dir (above).
- Edit /etc/kubernetes/manifests/etcd.yaml: change the hostPath volume for etcd-data from /var/lib/etcd to /var/lib/etcd-from-backup (and --initial-cluster-token if you set one).
- Wait for the etcd pod to come healthy, then move the apiserver manifest back.
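The manifest edit in the sequence above is the step most likely to go wrong by hand, because /var/lib/etcd also appears in --data-dir and in the container mountPath, and those should stay as they are. A sketch that rewrites only the hostPath line, assuming the usual kubeadm layout where that line reads exactly "path: /var/lib/etcd"; repoint_etcd_data_dir is a hypothetical helper, and the paths are assumed to contain no regex metacharacters or "|":

```shell
# Repoint the etcd-data hostPath in a kubeadm etcd manifest.
# Only a line that is exactly "path: <old>" (after indentation) is rewritten,
# so --data-dir=<old> and "mountPath: <old>" are left alone.
# repoint_etcd_data_dir is a hypothetical helper name.
repoint_etcd_data_dir() {
  manifest=$1
  old=$2
  new=$3
  # Work in a temp file outside the manifests directory so kubelet never
  # sees a half-written or stray manifest.
  tmp=$(mktemp) || return 1
  sed "s|^\([[:space:]]*path:[[:space:]]*\)${old}[[:space:]]*\$|\1${new}|" \
    "$manifest" > "$tmp" || return 1
  # Overwrite in place to keep the original file's owner and mode.
  cat "$tmp" > "$manifest"
  rm -f "$tmp"
}
```

Usage would be repoint_etcd_data_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd /var/lib/etcd-from-backup. Keeping a copy of the original manifest somewhere outside /etc/kubernetes/manifests first makes the change easy to roll back.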
Verification
Don’t declare victory on “the pods are Running”:
ETCDCTL_API=3 etcdctl endpoint health --cluster \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
etcdctl member list -w table # same flags
kubectl get nodes
kubectl get pods -A --sort-by=.metadata.creationTimestamp | tail -20
Expect controller churn: anything created after the snapshot no longer exists in etcd, so kubelets will kill those pods and Deployments will reconcile back to snapshot-time state. That’s correct behavior — but it’s why you check the newest objects, not the oldest. PVCs created after the snapshot deserve special attention: the volume may still exist in your storage backend with no object referencing it.
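The orphaned-volume check boils down to a set difference: volume IDs the storage backend knows about, minus the IDs that some PersistentVolume still references. A rough sketch, assuming you can dump both lists to files with one ID per line; orphaned_volumes is a hypothetical helper, and how you list backend volumes depends entirely on your storage system:

```shell
# Print volume IDs present in the storage backend but referenced by no PV.
# $1: file with one backend volume ID per line (backend-specific to produce)
# $2: file with one PV volume handle per line
# orphaned_volumes is a hypothetical helper name.
orphaned_volumes() {
  backend_ids=$1
  referenced_ids=$2
  # -v invert, -x whole-line match, -F fixed strings, -f patterns from file.
  # grep exits 1 when nothing is printed; treat "no orphans" as success.
  grep -vxF -f "$referenced_ids" "$backend_ids" || true
}
```

For CSI-provisioned volumes, the referenced list can come from kubectl get pv -o jsonpath='{range .items[*]}{.spec.csi.volumeHandle}{"\n"}{end}'. Anything the function prints is a candidate for manual review, not automatic deletion.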
What I write down
Per cluster: snapshot schedule and retention, the off-node copy destination, the exact restore command with real member names/IPs pre-filled, and the date of the last tested restore. An untested etcd backup is a hypothesis. My Kubernetes troubleshooting method applies here too: rehearse the restore on a lab cluster before you need it at 3 a.m.
Frequently asked questions
- How often should I snapshot etcd?
- For most clusters, every 30–60 minutes via cron or a CronJob, retained for a few days, plus a copy shipped off the node. etcd snapshots are cheap — single-digit seconds and usually well under a gigabyte — so the limiting factor is how much cluster-state change you can afford to lose, not the cost of taking them.
- Does restoring etcd restore my application data?
- No. An etcd restore recovers Kubernetes objects — Deployments, Services, Secrets, ConfigMaps, PVC definitions — not the contents of persistent volumes. Volume data lives in your storage backend and needs its own backup path. Restoring etcd from an old snapshot can also orphan volumes created after the snapshot.
- Why does the restore need a new data directory?
- etcdutl snapshot restore builds a brand-new member from the snapshot. If you point it at the existing data-dir, it either refuses or you risk mixing old WAL state with the restored keyspace. Restore into a fresh directory, then repoint the etcd static pod manifest at it.