Building a k8s cluster, part 2: setting up the etcd cluster

Date: 2023-03-09 08:06:40

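1 Introduction

     Etcd is a highly available key/value store, used mainly for shared configuration and service discovery.

     Simple: a user-facing API that works with curl (HTTP+JSON)
     Secure: optional SSL client certificate authentication
     Fast: a single instance can reach about 1,000 writes per second
     Reliable: uses Raft for distributed consensus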
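
Raft only commits a write once a majority of members has accepted it, which is why this guide builds a three-node cluster. The sketch below works through that arithmetic; the function names quorum and fault_tolerance are made up for illustration and are not part of etcd.

```shell
#!/usr/bin/env bash
# Majority (quorum) size for a Raft cluster with n voting members.
quorum() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of members that can fail while a majority is still available.
fault_tolerance() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

With three members the quorum is two, so the cluster keeps accepting writes with one node down.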

2 Setup

2.1 Install etcd with yum (run on all three nodes)

yum  -y install etcd

[root@k8s-master ~]# etcd -version
etcd Version: 3.3.11
Git SHA: 2cf9e51
Go Version: go1.10.3
Go OS/Arch: linux/amd64

2.2 Install the cfssl tool and create the certificates. All nodes share one certificate here (run on master)

mkdir -p /etc/etcd/ssl

cd /etc/etcd/ssl

cat etcd-root-ca-csr.json       # CSR for the etcd root CA

{
    "key": {
        "algo": "rsa",
        "size": 4096
    },
    "names": [
        {
            "O": "etcd",
            "OU": "etcd Security",
            "L": "Beijing",
            "ST": "Beijing",
            "C": "CN"
        }
    ],
    "CN": "etcd-root-ca"
}

cat etcd-gencert.json      # signing configuration used to issue the etcd cluster certificate

{
    "signing": {
        "default": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "87600h"
        }
    }
}

cat etcd-csr.json    # CSR for the etcd cluster certificate

{
    "key": {
        "algo": "rsa",
        "size": 4096
    },
    "names": [
        {
            "O": "etcd",
            "OU": "etcd Security",
            "L": "Beijing",
            "ST": "Beijing",
            "C": "CN"
        }
    ],
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "localhost",
        "192.168.137.66",
        "192.168.137.16",
        "192.168.137.26",
        "k8s-master",
        "k8s-node1",
        "k8s-node2"
    ]
}

Note: the last entry in "hosts" must not be followed by a comma, otherwise the file is not valid JSON.
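
A trailing comma in the "hosts" list is an easy mistake when editing the file by hand. One way to avoid it is to generate the list, as in the sketch below; render_hosts is a hypothetical helper written for this example, not part of cfssl.

```shell
#!/usr/bin/env bash
# Print a JSON "hosts" array for the given entries. Commas are written
# between entries only, so the last entry never gets a trailing comma.
render_hosts() {
  local i=0 total=$# h
  printf '"hosts": [\n'
  for h in "$@"; do
    i=$(( i + 1 ))
    if [ "$i" -lt "$total" ]; then
      printf '  "%s",\n' "$h"
    else
      printf '  "%s"\n' "$h"
    fi
  done
  printf ']\n'
}

render_hosts 127.0.0.1 localhost 192.168.137.66 192.168.137.16 192.168.137.26 \
  k8s-master k8s-node1 k8s-node2
```

The output can be pasted into etcd-csr.json in place of the hand-written "hosts" block.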

Download cfssl

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson

Generate the certificates

cfssl gencert --initca=true etcd-root-ca-csr.json | cfssljson --bare etcd-root-ca
cfssl gencert --ca etcd-root-ca.pem --ca-key etcd-root-ca-key.pem --config etcd-gencert.json etcd-csr.json | cfssljson --bare etcd

The resulting files, as listed by tree .

.
├── etcd.csr
├── etcd-csr.json
├── etcd-gencert.json
├── etcd-key.pem
├── etcd.pem
├── etcd-root-ca.csr
├── etcd-root-ca-csr.json
├── etcd-root-ca-key.pem
└── etcd-root-ca.pem
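
Before distributing anything it can help to confirm that the four .pem files etcd will need were actually produced. A minimal sketch, assuming the file names from the listing above; missing_cert_files is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the name of every expected certificate or key file that is
# missing from the given directory. Prints nothing when all are present.
missing_cert_files() {
  local dir=$1 f
  for f in etcd-root-ca.pem etcd-root-ca-key.pem etcd.pem etcd-key.pem; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Usage (on master): missing_cert_files /etc/etcd/ssl
```

Empty output means both the CA pair and the cluster certificate pair exist.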

2.3 Distribute the certificates (run on master)

I="192.168.137.16 192.168.137.26"

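for IP in $I; do
    # Create the target directory, copy the certificates and keys, then fix ownership and permissions
    ssh root@$IP mkdir -p /etc/etcd/ssl/
    scp *.pem root@$IP:/etc/etcd/ssl/
    ssh root@$IP chown -R etcd:etcd /etc/etcd/ssl/
    ssh root@$IP chmod -R 755 /etc/etcd/
done
# Set ownership and permissions on this server as well
cd /etc/etcd/ssl
chown -R etcd:etcd /etc/etcd/ssl
chmod -R 755 /etc/etcd/ssl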
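
Mode 755 on every file leaves the private keys readable by all local users. The sketch below is an optional, stricter variant and is not what the steps above do: it sets the *-key.pem files to 600 and the other .pem files to 644. It assumes the files are still owned by etcd:etcd so that the service can read its own key; tighten_cert_perms is a hypothetical name.

```shell
#!/usr/bin/env bash
# Restrict private keys to their owner and leave certificates world-readable.
tighten_cert_perms() {
  local dir=$1 f
  for f in "$dir"/*.pem; do
    [ -e "$f" ] || continue          # no .pem files matched the glob
    case "$f" in
      *-key.pem) chmod 600 "$f" ;;   # private key: owner read/write only
      *)         chmod 644 "$f" ;;   # certificate: readable by everyone
    esac
  done
}

# Usage (as root, after the chown above): tighten_cert_perms /etc/etcd/ssl
```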

2.4 Edit the configuration file (run on master)

[root@k8s-master ssl]# cat /etc/etcd/etcd.conf |grep -v '^#'
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.137.66:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.66:2379,http://127.0.0.1:2379"
ETCD_NAME="etcd01"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.66:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.66:2379"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.137.66:2380,etcd02=https://192.168.137.16:2380,etcd03=https://192.168.137.26:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new" # note: the state is new
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_CLIENT_CERT_AUTH="True"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/etcd-root-ca.pem"
ETCD_AUTO_TLS="True"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="True"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/etcd-root-ca.pem"
ETCD_PEER_AUTO_TLS="True"

Other nodes: the node-specific settings from the configuration above must be changed as follows (run on the other nodes)

# k8s-node1
ETCD_LISTEN_PEER_URLS="https://192.168.137.16:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.16:2379,http://127.0.0.1:2379"
ETCD_NAME="etcd02"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.16:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.16:2379"
ETCD_INITIAL_CLUSTER_STATE="existing"  # note: must not be new here; some guides use exist, but on this version the value that tested successfully is existing

# k8s-node2
ETCD_LISTEN_PEER_URLS="https://192.168.137.26:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.26:2379,http://127.0.0.1:2379"
ETCD_NAME="etcd03"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.26:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.26:2379"
ETCD_INITIAL_CLUSTER_STATE="existing"
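
Only the member name, the IP address and the cluster state differ between nodes, and each name has to match an entry in ETCD_INITIAL_CLUSTER. The sketch below prints the node-specific lines from those three values so they cannot drift apart; render_node_conf is a hypothetical helper, and its output would still have to be merged into /etc/etcd/etcd.conf by hand.

```shell
#!/usr/bin/env bash
# Print the node-specific etcd.conf settings for one cluster member.
# Arguments: member name, node IP, initial cluster state.
render_node_conf() {
  local name=$1 ip=$2 state=$3
  cat <<EOF
ETCD_LISTEN_PEER_URLS="https://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="https://${ip}:2379,http://127.0.0.1:2379"
ETCD_NAME="${name}"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://${ip}:2379"
ETCD_INITIAL_CLUSTER_STATE="${state}"
EOF
}

render_node_conf etcd03 192.168.137.26 existing
```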

2.5 Start etcd (run on master first, then on the other two nodes). If master takes a very long time to start, the configuration is probably wrong

systemctl  daemon-reload

systemctl  start etcd

systemctl  enable  etcd

2.6 Set the etcdctl API version. etcdctl has a v2 and a v3 API with different commands; v3 is used here

Set it for the current shell, and add the same line to /etc/profile so that it persists:

# export ETCDCTL_API=3
# cat /etc/profile
.....
export ETCDCTL_API=3

2.7 Verify node status

etcdctl --cacert=/etc/etcd/ssl/etcd-root-ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem   \
--endpoints=https://192.168.137.66:2379,https://192.168.137.16:2379,https://192.168.137.26:2379 endpoint health
[root@k8s-master ssl]# etcdctl --cacert=/etc/etcd/ssl/etcd-root-ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints=https://192.168.137.66:2379,https://192.168.137.16:2379,https://192.168.137.26:2379 endpoint health
https://192.168.137.16:2379 is healthy: successfully committed proposal: took = 4.807966ms
https://192.168.137.66:2379 is healthy: successfully committed proposal: took = 3.790949ms
https://192.168.137.26:2379 is healthy: successfully committed proposal: took = 2.493048ms
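
Typing the three TLS flags and the endpoint list for every command is error-prone. The sketch below wraps them in a shell function. The names etcdctl3, ETCD_SSL_DIR, ETCD_ENDPOINTS and ETCDCTL_BIN are assumptions made for this example (ETCDCTL_BIN only exists so the binary can be swapped out when trying the function); the flags and paths are the ones used above.

```shell
#!/usr/bin/env bash
ETCD_SSL_DIR=/etc/etcd/ssl
ETCD_ENDPOINTS=https://192.168.137.66:2379,https://192.168.137.16:2379,https://192.168.137.26:2379

# Run an etcdctl v3 command with the shared certificates and all endpoints.
etcdctl3() {
  ETCDCTL_API=3 "${ETCDCTL_BIN:-etcdctl}" \
    --cacert="$ETCD_SSL_DIR/etcd-root-ca.pem" \
    --cert="$ETCD_SSL_DIR/etcd.pem" \
    --key="$ETCD_SSL_DIR/etcd-key.pem" \
    --endpoints="$ETCD_ENDPOINTS" \
    "$@"
}

# Usage: etcdctl3 endpoint health
```

Putting the function in a file that the shell sources at login would make it available in every session.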

2.8 Checking etcd with API version 2

export ETCDCTL_API=2

[root@k8s-master ssl]# etcdctl --ca-file=/etc/etcd/ssl/etcd-root-ca.pem --cert-file=/etc/etcd/ssl/etcd.pem --key-file=/etc/etcd/ssl/etcd-key.pem  \
--endpoints=https://192.168.137.66:2379,https://192.168.137.16:2379 cluster-health
member 457528f516aae01a is healthy: got healthy result from https://192.168.137.66:2379
member b13478c4279881c2 is healthy: got healthy result from https://192.168.137.16:2379
cluster is healthy

3 Errors and fixes

error 1: etcdctl reports an error when running the check command

etcdctl --cacert=/etc/etcd/ssl/etcd-root-ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem   \
--endpoints=https://192.168.137.66:2379,https://192.168.137.16:2379,https://192.168.137.26:2379 endpoint health
........
flag provided but not defined: -cacert

solution: the API versions differ, and so does the command format

export ETCDCTL_API=3
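
The two API versions spell the TLS flags differently, as the commands in steps 2.7 and 2.8 show. The sketch below prints the matching set for a given version; tls_flags is a hypothetical helper, and the flag names are taken from those two steps.

```shell
#!/usr/bin/env bash
# Print the TLS flags that etcdctl expects for API version 2 or 3.
tls_flags() {
  local api=$1 dir=/etc/etcd/ssl
  case "$api" in
    2) echo "--ca-file=$dir/etcd-root-ca.pem --cert-file=$dir/etcd.pem --key-file=$dir/etcd-key.pem" ;;
    3) echo "--cacert=$dir/etcd-root-ca.pem --cert=$dir/etcd.pem --key=$dir/etcd-key.pem" ;;
    *) echo "unsupported ETCDCTL_API: $api" >&2; return 1 ;;
  esac
}

tls_flags 2
tls_flags 3
```

Seeing "flag provided but not defined: -cacert" therefore means the v3 flags were passed while etcdctl was running with the v2 API.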

error 2: failed to check the health of member 6c70a880257288f on https://192.168.137.16:2379: Get https://192.168.137.16:2379/health: remote error: tls: bad certificate

solution: certificate problem

Redo steps 2.2 and 2.3

error 3: couldn't find local name "etcd04" in the initial cluster configuration

solution: configuration file problem

Check step 2.4, paying particular attention to the following settings:
ETCD_NAME="etcd01"
ETCD_INITIAL_CLUSTER="etcd01=https://192.168.137.66:2380,etcd02=https://192.168.137.16:2380,etcd03=https://192.168.137.26:2380"
Then run the following commands:
systemctl stop etcd
rm -rf /var/lib/etcd/default.etcd
systemctl daemon-reload && systemctl restart etcd
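
This error means the value of ETCD_NAME does not appear as a key in ETCD_INITIAL_CLUSTER. A minimal sketch of that check, assuming both settings are written as double-quoted values as in step 2.4; name_in_initial_cluster is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeed if ETCD_NAME from the given config file is one of the member
# names in ETCD_INITIAL_CLUSTER; otherwise print a message and fail.
name_in_initial_cluster() {
  local conf=$1 name cluster
  name=$(sed -n 's/^ETCD_NAME="\([^"]*\)".*/\1/p' "$conf")
  cluster=$(sed -n 's/^ETCD_INITIAL_CLUSTER="\([^"]*\)".*/\1/p' "$conf")
  case ",$cluster," in
    *",$name="*) return 0 ;;
    *) echo "ETCD_NAME \"$name\" is not listed in ETCD_INITIAL_CLUSTER" >&2
       return 1 ;;
  esac
}

# Usage: name_in_initial_cluster /etc/etcd/etcd.conf && echo "name OK"
```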

error 4: the error is as follows

[root@k8s-node1 ~]# etcdctl member list
Error: dial tcp 127.0.0.1:2379: connect: connection refused

solution: edit the setting as described in step 2.4 so that etcd also listens on 127.0.0.1

ETCD_LISTEN_CLIENT_URLS="https://192.168.137.16:2379,http://127.0.0.1:2379"

[root@k8s-node1 ~]# etcdctl member list
457528f516aae01a, started, etcd01, https://192.168.137.66:2380, https://192.168.137.66:2379
b13478c4279881c2, started, etcd02, https://192.168.137.16:2380, https://192.168.137.16:2379
a93278c4200188c5, started, etcd03, https://192.168.137.26:2380, https://192.168.137.26:2379
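
To confirm at a glance that all three members have joined, the member list can be piped through a small counter. A minimal sketch, assuming the comma-separated output format shown above with the status in the second field; count_started is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Count the lines of `etcdctl member list` output whose status is "started".
count_started() {
  awk -F', ' '$2 == "started" { n++ } END { print n + 0 }'
}

# Usage: etcdctl member list | count_started
```

For the cluster in this guide the expected result is 3.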