Kubernetes 1.16.0 Fast Upgrade

Keywords: Kubernetes Docker kubelet sudo

Kubernetes 1.16.0 has been officially released. The quick upgrade (including fast download links for image mirrors hosted in China) involves three main steps: upgrading the kubeadm/kubectl/kubelet packages, pulling the container images, and upgrading the Kubernetes cluster. See also: Holding package versions on Ubuntu > Installing a specific Docker CE version.

1. Upgrade the kubeadm/kubectl/kubelet version

sudo apt install kubeadm=1.16.0-00 kubectl=1.16.0-00 kubelet=1.16.0-00
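If the packages were previously pinned with apt-mark hold (as the version-locking article referenced above suggests), a fuller sketch is to unhold them first and re-hold afterwards, so a routine apt-get upgrade cannot move the cluster to an untested version:

```shell
# Release the hold, install exactly 1.16.0-00, then pin the packages again
sudo apt-mark unhold kubeadm kubectl kubelet
sudo apt-get update
sudo apt-get install -y kubeadm=1.16.0-00 kubectl=1.16.0-00 kubelet=1.16.0-00
sudo apt-mark hold kubeadm kubectl kubelet
```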

View the container image versions required by this release:

kubeadm config images list

The output is as follows:

~# kubeadm config images list

k8s.gcr.io/kube-apiserver:v1.16.0
k8s.gcr.io/kube-controller-manager:v1.16.0
k8s.gcr.io/kube-scheduler:v1.16.0
k8s.gcr.io/kube-proxy:v1.16.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1

2. Pull the container images

The official Kubernetes images are hosted on gcr.io, which cannot be reached directly from mainland China. I mirrored them in an Alibaba Cloud container registry in the Hangzhou region, from which pulls are reasonably fast.

echo ""
echo "=========================================================="
echo "Pull Kubernetes v1.16.0 Images from aliyuncs.com ......"
echo "=========================================================="
echo ""

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

## Pull the images
docker pull ${MY_REGISTRY}/k8s-gcr-io-kube-apiserver:v1.16.0
docker pull ${MY_REGISTRY}/k8s-gcr-io-kube-controller-manager:v1.16.0
docker pull ${MY_REGISTRY}/k8s-gcr-io-kube-scheduler:v1.16.0
docker pull ${MY_REGISTRY}/k8s-gcr-io-kube-proxy:v1.16.0
docker pull ${MY_REGISTRY}/k8s-gcr-io-etcd:3.3.10
docker pull ${MY_REGISTRY}/k8s-gcr-io-pause:3.1
docker pull ${MY_REGISTRY}/k8s-gcr-io-coredns:1.3.1

## Tag the images as k8s.gcr.io
docker tag ${MY_REGISTRY}/k8s-gcr-io-kube-apiserver:v1.16.0 k8s.gcr.io/kube-apiserver:v1.16.0
docker tag ${MY_REGISTRY}/k8s-gcr-io-kube-scheduler:v1.16.0 k8s.gcr.io/kube-scheduler:v1.16.0
docker tag ${MY_REGISTRY}/k8s-gcr-io-kube-controller-manager:v1.16.0 k8s.gcr.io/kube-controller-manager:v1.16.0
docker tag ${MY_REGISTRY}/k8s-gcr-io-kube-proxy:v1.16.0 k8s.gcr.io/kube-proxy:v1.16.0
docker tag ${MY_REGISTRY}/k8s-gcr-io-etcd:3.3.10 k8s.gcr.io/etcd:3.3.10
docker tag ${MY_REGISTRY}/k8s-gcr-io-pause:3.1 k8s.gcr.io/pause:3.1
docker tag ${MY_REGISTRY}/k8s-gcr-io-coredns:1.3.1 k8s.gcr.io/coredns:1.3.1

echo ""
echo "=========================================================="
echo "Pull Kubernetes v1.16.0 Images FINISHED."
echo "into registry.cn-hangzhou.aliyuncs.com/openthings, "
echo "           by openthings@https://my.oschina.net/u/2306127."
echo "=========================================================="

echo ""

Save it as a shell script and execute it.

3. Upgrade Kubernetes Cluster

For a new installation:

# Specify the apiserver advertise address and version v1.16.0:
sudo kubeadm init --kubernetes-version=v1.16.0 --apiserver-advertise-address=10.1.1.199 --pod-network-cidr=10.244.0.0/16

# Note: CoreDNS is now built in, so the --feature-gates=CoreDNS=true parameter is no longer required

For an existing cluster, first check which component versions need to be upgraded. Running kubeadm upgrade plan outputs the version upgrade information:

COMPONENT            CURRENT   AVAILABLE
API Server           v1.15.2   v1.16.0
Controller Manager   v1.15.2   v1.16.0
Scheduler            v1.15.2   v1.16.0
Kube Proxy           v1.15.2   v1.16.0
CoreDNS              1.3.1     1.3.1
Etcd                 3.3.10    3.3.10

Make sure the container images above have already been downloaded (if they are not pulled in advance, the upgrade may hang waiting on the network), then perform the upgrade:

kubeadm upgrade apply -y v1.16.0

If you see the following message, the upgrade succeeded:

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.16.0". Enjoy!

Then, configure the current user environment:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You can use kubectl version to check component versions and kubectl cluster-info to view the service addresses.

4. Worker Node Upgrade

Each worker node needs to pull the container images of the corresponding version above and install the matching version of kubelet.
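The per-node procedure can be sketched as follows (node-1 is a placeholder node name; run the drain/uncordon steps from a machine with admin kubectl access, and the rest on the node itself):

```shell
# Drain the node so its workloads are rescheduled elsewhere (admin machine)
kubectl drain node-1 --ignore-daemonsets

# On the node: upgrade kubeadm first, then refresh the node's kubelet config
sudo apt-get install -y kubeadm=1.16.0-00
sudo kubeadm upgrade node

# Upgrade kubelet and kubectl, then restart the kubelet service
sudo apt-get install -y kubelet=1.16.0-00 kubectl=1.16.0-00
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# Make the node schedulable again (admin machine)
kubectl uncordon node-1
```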

Check version:

~$ kubectl version

View Pod information:

kubectl get pod --all-namespaces

Done.

5. HA Cluster Upgrade

When upgrading from the earlier 1.13.x releases, the upgrade failed because the API changed (after kubelet was raised to 1.14, the apiserver could not start), so the new kubeadm got errors accessing the old apiserver. After pulling the images down, you can switch the image versions manually (all the manifest files under /etc/kubernetes/manifests need to be modified).

For each node, perform the following steps:

  • cd /etc/kubernetes/manifests/.
  • Edit every *.yaml file there so the image version is v1.16.0.
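The manifest edit can be done with a one-line sed. Below is a sketch run against a scratch copy so the pattern can be verified safely; on a real node you would run the same sed, with sudo, directly in /etc/kubernetes/manifests after backing it up. The old tag v1.15.2 is taken from the upgrade-plan output above; adjust it to your cluster's current version.

```shell
# Scratch demo of the tag bump; on a real node run the sed in /etc/kubernetes/manifests
WORK=$(mktemp -d)
printf 'image: k8s.gcr.io/kube-apiserver:v1.15.2\n' > "$WORK/kube-apiserver.yaml"

# Replace the old version tag with v1.16.0 in every manifest; the kubelet
# picks up the change automatically because these are static-pod manifests
sed -i 's/v1\.15\.2/v1.16.0/g' "$WORK"/*.yaml

grep -h image "$WORK/kube-apiserver.yaml"   # image: k8s.gcr.io/kube-apiserver:v1.16.0
```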

Problems occurred after the 1.14.0 upgrade (and still exist in 1.14.1):

  • Worker nodes failed to join the cluster; see the kubeadm issue https://github.com/kubernetes/kubernetes/issues/76013
  • According to some community members' tests, a freshly installed 1.14 cluster runs normally.
  • My cluster was upgraded from 1.13.4, and after testing version 1.14.1 the problem still exists.
  • The kube-proxy version requires admin tools to change the DaemonSet's image version number to 1.14.1.
  • The coredns version requires admin tools to change the Deployment's image version number to 1.3.1.
    • Running the flannel installation again does not help.
    • After the modification the cluster still cannot restart; inspection shows the pods in Crash status.
    • Force-delete the Pod instances of CoreDNS; Kubernetes automatically starts new instances.
  • The previously installed JupyterHub cannot come up; the hub pod is in Crash status.
    • The hub log shows an SQLite access error; removing the file from the host storage directory still fails to make the hub service accessible.
    • While the hub pod is deleted, the proxy-public service cannot be connected.
    • Force-delete JupyterHub's hub and proxy pod instances.
    • Force-delete the Pod instances of CoreDNS; Kubernetes automatically starts new instances and restores operation.
    • Sometimes it is a GlusterFS permission problem; set ACLs with setfacl/getfacl.
    • Further examination showed the GlusterFS volume write problem might be caused by asynchronous replication.
      • A temporary write file for jupyterhub.sqlite exists in the hub-db-dir directory and causes a lock; it is not a GlusterFS write-permission problem.
      • Set gluster volume heal vol01 enable to synchronize the volume's data.
      • Or restart the volume or the glusterd service.
      • Or delete the jupyterhub.sqlite file in the hub-db-dir directory on all Gluster storage nodes, then delete the hub pod; the file is rebuilt automatically.
      • The steps above generally restore the service.
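The recovery steps above can be sketched as follows. The volume name vol01 comes from the text; the namespace jhub and the pod label component=hub are assumptions from a typical JupyterHub install, and the brick path is deliberately left generic:

```shell
# Enable self-heal so the volume's replicas resynchronize
sudo gluster volume heal vol01 enable

# Or restart the gluster daemon (or the volume itself)
sudo systemctl restart glusterd

# On every gluster storage node, remove the locked SQLite file
# (path under the brick's hub-db-dir directory; adjust to your layout)
sudo rm <brick-path>/hub-db-dir/jupyterhub.sqlite

# Delete the hub pod; Kubernetes recreates it and the file is rebuilt
kubectl -n jhub delete pod -l component=hub
```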

Other:

  • When the whole cluster became inaccessible, kubectl get node failed and kubectl version reported apiserver access errors.
  • Looking at the routes on one of the nodes, a mysterious podsxx 255.255.255.255 route record appeared again, and route del failed to delete it.
  • After running sudo netplan apply, the route record disappeared and the nodes became reachable again.
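A sketch of that diagnosis and fix on the affected node (the 255.255.255.255 destination is the bogus record described above; this assumes an Ubuntu node managed by netplan):

```shell
# Inspect the routing table for the bogus entry
ip route show

# Trying to delete it directly may fail, as observed above
sudo ip route del 255.255.255.255 || true

# Reapplying the netplan configuration rewrites the routes and clears it
sudo netplan apply
```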

Posted by rUmX on Sun, 22 Sep 2019 23:51:50 -0700