Linux VM fails to boot after an unexpected power loss on the physical host (system disk mount failure); kubectl connection refused

Date: 2023-12-17 19:34:56

CentOS 7 VM fails to mount its file system

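Last Friday I left work without shutting down either the virtual machines or the physical host.

When I started the virtual machine on Monday, the operating system failed to boot.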

The cause was exactly the one described in this article.

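After the file system mount problem was fixed, a second problem showed up.

kubectl commands fail

kubectl get nodes and every other kubectl command failed with the same error: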

The connection to the server 192.168.102.149:6443 was refused - did you specify the right host or port?

Running ss -tnl (or netstat -tnl) showed that nothing was listening on port 6443.
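The port check above can be scripted. The following is a minimal sketch, not part of the original troubleshooting session: port_is_listening is a hypothetical helper name, and it assumes the default column layout of ss -tnl and netstat -tnl, where the local address is the fourth column.

```shell
# Hypothetical helper: reads `ss -tnl` or `netstat -tnl` output on stdin and
# succeeds (exit 0) if some socket's local address ends in ":<port>".
# Assumption: the local address is the 4th column, as in both tools' default output.
port_is_listening() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the master:
#   ss -tnl | port_is_listening 6443 && echo "apiserver port is open"
```

Matching on the trailing ":<port>" keeps a port such as 16443 from being mistaken for 6443.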

A Google search suggested that the problem was the apiserver failing to start.

docker ps -a | grep k8s_kube-apiserver
docker logs fd6330153fc3

From the output of these commands, I found that the apiserver failed to start because of

addrConn.createTransport failed to connect to {127.0.0.1:2379

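and, in the end, unable to create storage backend. 127.0.0.1:2379 is the etcd client endpoint, so the apiserver could not reach etcd, which it uses as its storage backend.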
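The container lookup used above can be wrapped so that the container ID does not have to be copied by hand, and the same lookup can be pointed at the etcd container. This is only a sketch: first_container_id and show_container_logs are hypothetical helper names, and the k8s_etcd name prefix is an assumption based on how the kubelet names Docker containers.

```shell
# Hypothetical helper: reads `docker ps -a` output on stdin and prints the ID
# (first column) of the first line that matches the given pattern.
# `docker ps -a` normally lists the most recently created container first.
first_container_id() {
    awk -v pat="$1" '$0 ~ pat { print $1; exit }'
}

# Hypothetical wrapper: show the last 50 log lines of the newest container
# whose `docker ps -a` line matches the pattern.
show_container_logs() {
    id=$(docker ps -a | first_container_id "$1")
    if [ -z "$id" ]; then
        echo "no container matching $1" >&2
        return 1
    fi
    docker logs --tail 50 "$id"
}

# Usage on the master (requires Docker):
#   show_container_logs k8s_kube-apiserver
#   show_container_logs k8s_etcd    # assumed name prefix of the etcd container
```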

Reinstalling the k8s (Kubernetes) cluster with kubeadm

I tried various approaches but could not repair the apiserver, so I decided to reinstall the Kubernetes cluster.

On the master:

kubeadm reset
kubeadm init --kubernetes-version=v1.14.2 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --ignore-preflight-errors=Swap
# Install the new admin kubeconfig first; the old one no longer matches the re-initialized cluster
rm -fr $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Deploy the flannel pod network
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

On node1 and node2:

kubeadm reset
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
cat /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
kubeadm join 192.168.202.130:6443 --token ozs9n1.xz2k1w58i5ndsaim \
--discovery-token-ca-cert-hash sha256:3ca6e686aaec53d11ae08ac29d7de3bf328fd513847c2ffb0d9f317d36ccde96 --ignore-preflight-errors=Swap
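Values written into /proc/sys with echo last only until the next reboot. A minimal sketch of making the two bridge settings persistent through a sysctl drop-in file follows; the helper name write_bridge_sysctl_conf and the file name k8s.conf are assumptions, not something the original steps used.

```shell
# Hypothetical helper: write the two bridge netfilter settings to a sysctl
# drop-in file so that they are re-applied at boot.
# Assumption: the default path below; any *.conf file under /etc/sysctl.d is read.
write_bridge_sysctl_conf() {
    conf="${1:-/etc/sysctl.d/k8s.conf}"
    cat > "$conf" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}

# Usage on each node, as root:
#   write_bridge_sysctl_conf
#   sysctl --system    # reload all sysctl configuration files
```

These keys are present only while bridge netfilter support is loaded in the kernel, so the reload may report them as unknown on a machine where it is not.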

After these steps, the Kubernetes cluster finally came back to life.

kubectl get cs
kubectl get nodes
kubectl get pods
kubectl get pods -n kube-system
kubectl get ns

The output was as follows:

[root@svn ~]# kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
[root@svn ~]# kubectl get nodes
NAME              STATUS   ROLES    AGE   VERSION
app.centos7.com   Ready    <none>   57m   v1.14.2
jks.centos7.com   Ready    <none>   55m   v1.14.2
svn.centos7.com   Ready    master   61m   v1.14.2
[root@svn ~]# kubectl get nodes -n kube-system
NAME              STATUS   ROLES    AGE   VERSION
app.centos7.com   Ready    <none>   57m   v1.14.2
jks.centos7.com   Ready    <none>   55m   v1.14.2
svn.centos7.com   Ready    master   61m   v1.14.2
[root@svn ~]# kubectl get ns
NAME              STATUS   AGE
default           Active   61m
kube-node-lease   Active   61m
kube-public       Active   61m
kube-system       Active   61m
[root@svn ~]# kubectl get pods
No resources found.
[root@svn ~]# kubectl get pods -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-2zv9n                   1/1     Running   3          62m
coredns-fb8b8dccf-wwmtk                   1/1     Running   3          62m
etcd-svn.centos7.com                      1/1     Running   1          61m
kube-apiserver-svn.centos7.com            1/1     Running   1          61m
kube-controller-manager-svn.centos7.com   1/1     Running   1          61m
kube-flannel-ds-amd64-989ld               1/1     Running   0          48m
kube-flannel-ds-amd64-bdnkg               1/1     Running   1          48m
kube-flannel-ds-amd64-mndjd               1/1     Running   0          48m
kube-proxy-2s2c9                          1/1     Running   0          58m
kube-proxy-5h7gp                          1/1     Running   1          62m
kube-proxy-ms7cr                          1/1     Running   0          57m
kube-scheduler-svn.centos7.com            1/1     Running   1          61m
[root@svn ~]#
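
Reading the node list by eye works for three nodes. As a sketch of automating that check, the hypothetical helper below (not_ready_nodes is an assumed name) filters kubectl get nodes output for nodes whose STATUS column is anything other than Ready.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the name of every node whose STATUS column is not exactly "Ready".
# Exits 0 only if no such node was found. A cordoned node
# ("Ready,SchedulingDisabled") is also reported by this strict comparison.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1; bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Usage on the master:
#   kubectl get nodes --no-headers | not_ready_nodes && echo "all nodes Ready"
```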

References