Etcd v3 backup and recovery

Keywords: Database Kubernetes etcd

ETCD introduction

etcd is a distributed, consistent key-value (KV) storage system for shared configuration and service discovery. It is an open source project initiated by CoreOS and released under the Apache license.

ETCD usage scenarios

ETCD has many usage scenarios, including but not limited to:

  • Configuration management
  • Service registration and discovery
  • Leader election
  • Application scheduling
  • Distributed queues
  • Distributed locks

ETCD stores all k8s cluster data

etcd is a critical service in a k8s cluster: it stores all of the cluster's data. If a disaster strikes and the etcd data is lost, recovering the cluster becomes impossible, so this article focuses on how to back up and restore that data.

ETCD common query operations

  • View cluster status
$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=,, endpoint health
is healthy: successfully committed proposal: took = 1.698385ms
is healthy: successfully committed proposal: took = 1.577913ms
is healthy: successfully committed proposal: took = 5.616079ms
  • Get a key information
$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=,, get /registry/
  • Get etcd version information
$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=,, version
  • Get all etcd keys
$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=,, get / --prefix --keys-only
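The --keys-only listing above can be post-processed with standard tools to see how many objects of each type the cluster stores. A minimal sketch, using a few illustrative keys in place of the real etcdctl output:

```shell
# Group etcd keys by their /registry/<resource> prefix and count them.
# In a live cluster the key list would come from the
# "get / --prefix --keys-only" command above; the sample keys below are
# illustrative stand-ins for that output.
printf '%s\n' \
  /registry/pods/default/nginx \
  /registry/pods/kube-system/coredns \
  /registry/services/specs/default/kubernetes |
  awk -F/ '{count[$3]++} END {for (r in count) print r, count[r]}' | sort
```

With the sample keys this prints "pods 2" and "services 1"; against real etcdctl output it gives a quick per-resource inventory.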


Environment:
  • etcd version 3.2.12
  • Kubernetes version v1.15.6, binary installation


Note: the etcdctl command line differs slightly between etcd versions, but the workflow is much the same. This article uses "snapshot save" for backup, and the snapshot only needs to be taken on a single node.

Backup command (run on the k8s-master1 machine):

$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints= snapshot save /data/etcd_backup_dir/etcd-snapshot-`date +%Y%m%d`.db
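It is worth checking a snapshot before relying on it: "etcdctl snapshot status" reports its hash, revision, total key count and size. A hedged sketch, guarded so it degrades gracefully on hosts where etcdctl or the backup file is absent:

```shell
# Inspect a freshly taken snapshot; the path matches the backup command above.
SNAP="/data/etcd_backup_dir/etcd-snapshot-$(date +%Y%m%d).db"
if command -v etcdctl >/dev/null 2>&1 && [ -f "$SNAP" ]; then
  ETCDCTL_API=3 etcdctl snapshot status "$SNAP" --write-out=table
else
  echo "skipping: etcdctl or $SNAP not available"
fi
```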

Backup script (run on the k8s-master1 machine):

#!/usr/bin/env bash

# Certificate paths match the etcdctl commands used elsewhere in this article;
# ENDPOINTS must be set to the cluster's etcd endpoints.
CACERT="/opt/kubernetes/ssl/ca.pem"
CERT="/opt/kubernetes/ssl/server.pem"
KEY="/opt/kubernetes/ssl/server-key.pem"
ENDPOINTS=

ETCDCTL_API=3 etcdctl \
  --cacert="${CACERT}" --cert="${CERT}" --key="${KEY}" \
  --endpoints="${ENDPOINTS}" \
  snapshot save /data/etcd_backup_dir/etcd-snapshot-$(date +%Y%m%d).db

# Keep backups for 30 days
find /data/etcd_backup_dir/ -name "*.db" -mtime +30 -exec rm -f {} \;
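With the script saved somewhere like /data/scripts/etcd_backup.sh (the path here is an assumption), a crontab entry can run it nightly so the 30-day retention window stays populated:

```
# Run the etcd backup script every day at 02:00 and log each run
0 2 * * * /data/scripts/etcd_backup.sh >> /var/log/etcd_backup.log 2>&1
```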



  • Stop the kube-apiserver service on all masters
$ systemctl stop kube-apiserver

# Confirm that the kube-apiserver service has stopped
$ ps -ef | grep kube-apiserver
  • Stop the etcd service on all nodes in the cluster
$ systemctl stop etcd
  • Move the etcd data directory aside
$ mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak
  • Copy ETCD backup snapshot
# Copy backup from k8s-master1 machine 
$ scp /data/etcd_backup_dir/etcd-snapshot-20191222.db root@k8s-master2:/data/etcd_backup_dir/ 
$ scp /data/etcd_backup_dir/etcd-snapshot-20191222.db root@k8s-master3:/data/etcd_backup_dir/
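Before restoring on the other masters, it is prudent to confirm the copied snapshot arrived intact by comparing sha256 checksums on the source and destination hosts. A sketch against a scratch file, which stands in for the real .db snapshot:

```shell
# Compute the checksum that would be compared across hosts after scp.
SNAP=$(mktemp /tmp/etcd-snapshot.XXXXXX)   # stand-in for the real snapshot .db
echo "snapshot payload" > "$SNAP"
sha256sum "$SNAP" | awk '{print $1}'       # run the same on each destination host
rm -f "$SNAP"
```

If the hashes printed on k8s-master1, k8s-master2 and k8s-master3 match, the copies are identical.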

Restore backup

# Operation on k8s-master1 machine
$ ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup_dir/etcd-snapshot-20191222.db \
  --name etcd-0 \
  --initial-cluster "etcd-0=,etcd-1=,etcd-2=" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls

# Operation on k8s-master2 machine
$ ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup_dir/etcd-snapshot-20191222.db \
  --name etcd-1 \
  --initial-cluster "etcd-0=,etcd-1=,etcd-2=" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls

# Operation on k8s-master3 machine
$ ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup_dir/etcd-snapshot-20191222.db \
  --name etcd-2 \
  --initial-cluster "etcd-0=,etcd-1=,etcd-2=" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls

After the restore has completed on all three nodes, log in to each machine in turn and start etcd:

$ systemctl start etcd

After all three etcd instances have started, check the etcd cluster's health:

$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=,, endpoint health
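Beyond endpoint health, "etcdctl endpoint status -w table" shows each member's DB size, leader flag and raft term, which helps confirm the restored cluster has elected a leader. A guarded sketch using the same cert paths as the commands above:

```shell
# Per-member status; falls back to a message where no cluster is reachable.
if command -v etcdctl >/dev/null 2>&1; then
  ETCDCTL_API=3 etcdctl \
    --cacert=/opt/kubernetes/ssl/ca.pem \
    --cert=/opt/kubernetes/ssl/server.pem \
    --key=/opt/kubernetes/ssl/server-key.pem \
    endpoint status --write-out=table || echo "etcd not reachable"
else
  echo "etcdctl not installed"
fi
```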

Once all three etcd members report healthy, start kube-apiserver on each master:

$ systemctl start kube-apiserver

Check whether the Kubernetes cluster is back to normal:

$ kubectl get cs


Backing up a Kubernetes cluster is mainly a matter of backing up the etcd cluster. During recovery, the order of operations is what matters most:

stop kube-apiserver -> stop etcd -> restore data -> start etcd -> start kube-apiserver

Note: when backing up an etcd cluster, only one member needs to be backed up; when restoring, use that same backup file on every member.

Posted by crosbystillsnas on Sat, 06 Nov 2021 10:12:25 -0700