1 Application scenarios
Scenario:
- In day-to-day operations, a Worker node may need to be maintained or migrated
- To do so, we need to stop and start the node smoothly
- The impact on the cluster and its services during the stop/start should be minimized
Notes:
- This procedure takes a Worker node out of service
- The workloads (Pods) on the Worker node will be evicted to other nodes
- Make sure the remaining cluster resources are sufficient to absorb the evicted workloads (see the quick check below)
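Before cordoning the node, it can help to confirm that the other nodes have room for the evicted workloads. A minimal sketch of such a check is shown below; `kubectl top nodes` assumes metrics-server is installed in the cluster, and `<node-name>` is a placeholder for any node you want to inspect:
# Current CPU/memory usage per node (requires metrics-server)
kubectl top nodes
# Requests/limits already allocated on a given node
kubectl describe node <node-name> | grep -A 8 "Allocated resources"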
2 Operation steps
2.1 Stop Worker node scheduling
# View information
root@sh-gpu091:~# kubectl get node
NAME           STATUS   ROLES   AGE    VERSION
172.19.13.31   Ready    node    403d   v1.14.1
<node-name>    Ready    node    403d   v1.14.1
...            Ready    node    403d   v1.14.1

# Stop Worker node scheduling (<node-name> is a placeholder for the node to be maintained)
root@sh-gpu091:~# kubectl cordon <node-name>
node/<node-name> cordoned
- Check the node status
root@sh-gpu091:~# kubectl get node
NAME           STATUS                     ROLES   AGE    VERSION
172.19.13.31   Ready                      node    403d   v1.14.1
...            Ready                      node    403d   v1.14.1
<node-name>    Ready,SchedulingDisabled   node    403d   v1.14.1
2.2 Evict workloads on the Worker node
# --ignore-daemonsets   Ignore DaemonSet-managed pods when evicting
# --delete-local-data   Delete the pods' local (emptyDir) data when evicting; persistent data is not deleted
root@sh-gpu091:~# kubectl drain --delete-local-data --ignore-daemonsets --force <node-name>
node/<node-name> already cordoned
WARNING: ignoring DaemonSet-managed Pods: cattle-system/cattle-node-agent-8wcvs, kube-system/kube-flannel-ds-kqzhc, kube-system/nvidia-device-plugin-daemonset-rr2lf, monitoring/prometheus-node-exporter-xtbxp
evicting pod "model-server-0"
evicting pod "singleview-proxy-client-pbdownloader-0"
evicting pod "singleview-proxy-service-0"
pod/singleview-proxy-client-pbdownloader-0 evicted
pod/singleview-proxy-service-0 evicted
pod/model-server-0 evicted
node/<node-name> evicted
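If you want to double-check that the evicted pods have been recreated and are Running on other nodes, one option is to search for them by name; `model-server` below is taken from the drain output above:
# Confirm the evicted workloads were rescheduled elsewhere and are Running
kubectl get pod -A -o wide | grep model-server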
2.3 Stop Docker, Kubelet and other services
systemctl stop kubelet
systemctl stop docker
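Optionally, if the host will be rebooted or stay offline for a while during maintenance, you can also keep the services from starting automatically, and re-enable them before recovering the node:
# Optional: prevent the services from auto-starting during maintenance
systemctl disable kubelet
systemctl disable docker
# Re-enable them before recovering the node
systemctl enable docker
systemctl enable kubelet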
- Check whether any pods are still running on the node
kubectl get pod -A -o wide | grep <node-name>
- If the node does not need to be recovered, you can delete it from the cluster and confirm the remaining node list
root@sh-gpu091:~# kubectl delete node <node-name>
node "<node-name>" deleted
root@sh-gpu091:~# kubectl get node
NAME           STATUS   ROLES   AGE    VERSION
172.19.13.31   Ready    node    403d   v1.14.1
...            Ready    node    403d   v1.14.1
root@sh-gpu091:~#
2.4 Recover the Worker node
systemctl start docker
systemctl status docker
systemctl start kubelet
systemctl status kubelet
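Once kubelet is running again, the node should re-register with the API server and return to Ready after a short while. One way to watch for this, with `<node-name>` again as a placeholder:
# Wait for the node to report Ready again
kubectl get node <node-name> -w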
2.5 Allow Worker node scheduling
# Cancel the unschedulable (cordoned) state
kubectl uncordon <node-name>
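As a final check, the node's STATUS should now show Ready without the SchedulingDisabled mark, which means new pods can be scheduled onto it again:
# STATUS should show "Ready" only
kubectl get node <node-name>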
Summary
The above is based on my personal experience; I hope it serves as a useful reference, and I appreciate your continued support.