Nginx Basics (Part 7): Nginx + keepalived High-Availability Cluster

Date: 2023-03-08 17:01:35
  • I. Introduction to keepalived

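  keepalived was originally designed specifically for the LVS load balancer, to manage and monitor the state of each service node in an LVS cluster. VRRP support was added later to provide high availability. As a result, besides managing LVS, Keepalived can also serve as a high-availability solution for other services such as Nginx, HAProxy, and MySQL. Conceptually, Keepalived works like a switching mechanism operating at layers 3, 4, and 7.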

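  Keepalived implements high availability through VRRP. VRRP (Virtual Router Redundancy Protocol) exists to remove the single point of failure of static routing: when an individual node goes down, the network as a whole keeps running. Keepalived therefore does two jobs: it configures and manages LVS and health-checks the nodes behind it, and it provides high availability for network services on the system.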

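  • II. The three functions of Keepalived

1. Managing the LVS load balancer

2. Health-checking the nodes of an LVS cluster

3. Providing high availability for network services (the focus of this article)

  Keepalived monitors the state of the servers in the cluster. If a web or MySQL server goes down or fails, Keepalived detects it and removes that server from the cluster; once the server is healthy again, Keepalived automatically adds it back. None of this needs manual intervention; the only manual work is repairing the failed server.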

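  • III. How Keepalived works

  Keepalived high-availability peers communicate with each other over VRRP. So what is VRRP?

  (1) VRRP stands for Virtual Router Redundancy Protocol. It was created to remove the single point of failure of static routing.

  (2) VRRP uses an election mechanism to decide which VRRP router takes over the routing task.

  (3) VRRP uses IP multicast (default multicast address: 224.0.0.18) for communication between the high-availability pair.

  (4) In operation, the master node sends advertisement packets and the backup node receives them. When the backup stops receiving packets from the master, it starts its takeover procedure and takes over the master's resources. There can be several backup nodes, which are ranked by priority, but a typical Keepalived deployment runs as a single pair.

  (5) VRRP supports authenticating its packets, but the Keepalived documentation currently still recommends configuring the authentication type and password in plaintext (auth_type PASS).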

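  With VRRP understood, here is how Keepalived works:

  A Keepalived high-availability pair communicates over VRRP, and VRRP uses an election to decide which node is master and which is backup. The master has the higher priority, so during normal operation it holds all the resources while the backup waits. When the master goes down, the backup takes over the master's resources and serves clients in its place.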
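The election described above can be pictured with a toy model. This is only a sketch: `elect_master` is a hypothetical helper, the `name:priority` pairs (150 and 100) are made-up values, and real VRRP additionally breaks priority ties by IP address, which is ignored here.

```shell
#!/bin/bash
# Toy model of the priority election: print the node with the highest priority.
# Arguments are hypothetical "name:priority" pairs; only nodes that are still
# advertising should be passed in.
elect_master() {
    local best_name="" best_prio=-1 pair name prio
    for pair in "$@"; do
        name=${pair%%:*}     # text before the first colon
        prio=${pair##*:}     # text after the last colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master lb01:150 lb02:100   # prints lb01
elect_master lb02:100            # prints lb02 (lb01 is no longer advertising)
```

The second call mirrors a failover: once the higher-priority node drops out, the remaining node wins the election.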

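  Within a Keepalived pair, only the master keeps sending VRRP multicast advertisements, which tell the backup that the master is still alive; while they arrive, the backup does not try to take over. When the master becomes unavailable, that is, when the backup no longer hears the master's advertisements, the backup starts the relevant services and takes over the resources to keep the business running. Takeover can complete in under 1 second at best.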
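One way to see these advertisements on the wire is a packet capture on the backup node. The helper below only builds the command string so that it can be inspected first; `vrrp_capture_cmd` is a hypothetical name and `eth0` is assumed from the configurations in this article. `vrrp` is the pcap filter keyword for IP protocol 112.

```shell
#!/bin/bash
# Build (but do not run) a tcpdump command that shows VRRP advertisements.
# The interface defaults to eth0, which is an assumption taken from this article.
vrrp_capture_cmd() {
    local iface=${1:-eth0}
    echo "tcpdump -i ${iface} -nn vrrp"
}

vrrp_capture_cmd eth0   # prints: tcpdump -i eth0 -nn vrrp
```

Run as root, the printed command should show packets from the master's address to 224.0.0.18 at the configured advertisement interval; if they stop arriving, the backup will take over.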

  • IV. Deploying a Keepalived high-availability service

1. Environment

Hostname  IP             Role
lb01      192.168.56.12  keepalived MASTER
lb02      192.168.56.13  keepalived BACKUP

2. Deploying Keepalived

(1) Install keepalived

[root@lb01 ~]# yum install -y keepalived
[root@lb02 ~]# yum install -y keepalived
[root@lb01 ~]# rpm -qa keepalived
keepalived-1.3.-.el7.x86_64
[root@lb02 ~]# rpm -qa keepalived
keepalived-1.3.-.el7.x86_64

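(2) Walkthrough of the high-availability part of keepalived.conf

[root@lb01 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {    # optional: e-mail addresses for failure alerts; several may be listed
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc    # optional: sender address for alert mail
smtp_server 192.168.200.1    # optional: SMTP server used to send mail; if sendmail or postfix runs locally, the default address above can be used
smtp_connect_timeout    # optional: SMTP connection timeout
router_id LVS_DEVEL    # router identifier of this Keepalived server; must be unique within the same LAN
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval
vrrp_gna_interval
}

vrrp_instance VI_1 {    # VRRP instance block defining an instance named VI_1. Each vrrp_instance can be seen as one Keepalived service instance or one business service. Every instance on the master must also exist on the backup, otherwise takeover cannot happen.
state MASTER    # initial role; only MASTER and BACKUP are valid, and the value must be upper case
interface eth0    # network interface used by Keepalived
virtual_router_id    # virtual router ID; should be a unique number. MASTER and BACKUP must use the same ID for the same instance, otherwise split brain occurs.
priority    # priority; the higher the number, the higher the priority. A gap of 50 or more between MASTER and BACKUP is recommended.
advert_int    # advertisement interval, that is, how often MASTER and BACKUP check on each other; in seconds, default 1
authentication {    # authentication settings
auth_type PASS    # authentication type, either PASS or AH; PASS is the officially recommended one
auth_pass    # password, at most 8 characters; MASTER and BACKUP of the same instance must use the same password to communicate
}
virtual_ipaddress {    # virtual IP addresses; several may be listed, one per line. It is best to state the netmask and the interface the VIP binds to explicitly.
192.168.200.16
192.168.200.17
192.168.200.18
}
}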
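Because a mismatched virtual_router_id between MASTER and BACKUP leads to split brain, it can help to compare the two files before starting the service. The following is a minimal sketch, not a keepalived tool: `get_directive` and `same_directive` are hypothetical helpers, they read only the first occurrence of a directive, and they assume the backup's file has been copied next to the master's.

```shell
#!/bin/bash
# Print the first value of a directive (for example virtual_router_id)
# found in a keepalived.conf-style file.
get_directive() {
    local file=$1 key=$2
    awk -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Succeed only if both files define the directive and the values are equal.
same_directive() {
    local file_a=$1 file_b=$2 key=$3 a b
    a=$(get_directive "$file_a" "$key")
    b=$(get_directive "$file_b" "$key")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (not run here; the file names are placeholders):
#   same_directive lb01.conf lb02.conf virtual_router_id || echo "virtual_router_id differs"
```

The same check applied to `priority` should fail on a correct pair, since the two nodes are meant to differ there.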

3. Single-instance Keepalived high-availability demo

(1) Configure the Keepalived master server lb01 (MASTER)

[root@lb01 keepalived]# cp keepalived.conf keepalived.conf.bak
[root@lb01 keepalived]# vim keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb01
}

vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}
[root@lb01 keepalived]# systemctl start keepalived    # start keepalived once the configuration is done
[root@lb01 keepalived]# ip addr |grep 192.168.56.20    # check that the configured virtual IP 192.168.56.20 is present
inet 192.168.56.20/ scope global secondary eth0:

(2) Configure the Keepalived backup server lb02 (BACKUP)

[root@lb02 keepalived]# cp keepalived.conf keepalived.conf.bak
[root@lb02 keepalived]# vim keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb02
}

vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}
[root@lb02 keepalived]# systemctl start keepalived    # start keepalived once the configuration is done
[root@lb02 keepalived]# ip addr |grep 192.168.56.20    # check for the virtual IP 192.168.56.20; the backup should not hold it, and any output here means split brain

(3) Master/backup failover test

Stop keepalived on the master, then check the virtual IP on lb01 and lb02:
[root@lb01 keepalived]# systemctl stop keepalived     # stop keepalived on the master
[root@lb01 keepalived]# ip addr |grep 192.168.56.20   # after keepalived stops, lb01 no longer holds the virtual IP 192.168.56.20
[root@lb02 keepalived]# ip addr |grep 192.168.56.20   # lb02 now shows the virtual IP 192.168.56.20: the VIP has floated over
inet 192.168.56.20/ scope global secondary eth0:

Start keepalived on the master again, then check the virtual IP on lb01 and lb02:
[root@lb01 keepalived]# systemctl start keepalived     # start keepalived on lb01 again
[root@lb01 keepalived]# ip addr |grep 192.168.56.20    # the virtual IP is back on lb01
inet 192.168.56.20/ scope global secondary eth0:
[root@lb02 keepalived]# ip addr |grep 192.168.56.20    # lb02 no longer holds the virtual IP
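The `grep` check used above is convenient interactively, but the pattern 192.168.56.20 would also match an address such as 192.168.56.200. For scripted checks, a stricter test can compare the address field exactly. This is a sketch only; `has_vip` is a hypothetical helper that reads plain `ip addr` output on standard input.

```shell
#!/bin/bash
# Return 0 if the given VIP appears as an exact "inet" address in `ip addr`
# output supplied on stdin, and 1 otherwise.
has_vip() {
    local vip=$1
    awk -v vip="$vip" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}

# Example (not run here):
#   ip -4 addr show | has_vip 192.168.56.20 && echo "VIP is on this node"
```

Reading from standard input keeps the function easy to try against saved output from either node.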

 4. Dual-instance, dual-master Keepalived demo

(1) Edit the main configuration file on lb01 and lb02 and add a second instance, VI_2

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb01
}

vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}

vrrp_instance VI_2 {    # add a second VRRP instance, VI_2
state BACKUP
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.30/ dev eth0 label eth0:    # the virtual IP is 192.168.56.30
}
}

[root@lb02 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb02
}

vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}

vrrp_instance VI_2 {    # add a second VRRP instance, VI_2
state MASTER
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.30/ dev eth0 label eth0:    # the virtual IP is 192.168.56.30
}
}

(2) Restart Keepalived on lb01 and lb02 and observe the initial VIP placement

[root@lb01 keepalived]# systemctl restart keepalived
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
inet 192.168.56.20/ scope global secondary eth0:

[root@lb02 keepalived]# systemctl restart keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
inet 192.168.56.30/ scope global secondary eth0:

After Keepalived starts on lb01, the node initially brings up the VIP 192.168.56.20; that is, the VIP configured in instance VI_1 serves traffic from lb01.

After Keepalived starts on lb02, the node initially brings up the VIP 192.168.56.30; that is, the VIP configured in instance VI_2 serves traffic from lb02.
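The expected VIP placement in this dual-master layout can be written down as a small model, which is handy when reading the failover results. It is a sketch under two assumptions: each instance's MASTER has the higher priority, and preemption is left at its default so a returning master reclaims its VIP. `vip_layout` is a hypothetical helper.

```shell
#!/bin/bash
# Toy model of the dual-master layout: given whether keepalived is running
# on lb01 and lb02 (1 = up, 0 = down), print the node expected to hold each VIP.
vip_layout() {
    local lb01_up=$1 lb02_up=$2
    local owner20="none" owner30="none"
    # VI_1 (192.168.56.20): lb01 is MASTER, lb02 is BACKUP
    if [ "$lb01_up" -eq 1 ]; then owner20=lb01
    elif [ "$lb02_up" -eq 1 ]; then owner20=lb02
    fi
    # VI_2 (192.168.56.30): lb02 is MASTER, lb01 is BACKUP
    if [ "$lb02_up" -eq 1 ]; then owner30=lb02
    elif [ "$lb01_up" -eq 1 ]; then owner30=lb01
    fi
    echo "192.168.56.20=$owner20 192.168.56.30=$owner30"
}

vip_layout 1 1   # prints: 192.168.56.20=lb01 192.168.56.30=lb02
vip_layout 0 1   # prints: 192.168.56.20=lb02 192.168.56.30=lb02
```

Both load balancers carry traffic in the normal case, and either one ends up with both VIPs when its peer is down.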

(3) High-availability failover test

[root@lb01 keepalived]# systemctl stop keepalived    # stop keepalived on lb01
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # no VIP is visible on lb01
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # both VIPs are now visible on lb02
inet 192.168.56.30/ scope global secondary eth0:
inet 192.168.56.20/ scope global secondary eth0:
[root@lb01 keepalived]# systemctl start keepalived    # start keepalived on lb01 again
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # the VIP 192.168.56.20 has floated back
inet 192.168.56.20/ scope global secondary eth0:

In the same way, stop keepalived on lb02 and check the VIPs:
[root@lb02 keepalived]# systemctl stop keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
inet 192.168.56.20/ scope global secondary eth0:
inet 192.168.56.30/ scope global secondary eth0:
[root@lb02 keepalived]# systemctl start keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
inet 192.168.56.30/ scope global secondary eth0:
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
inet 192.168.56.20/ scope global secondary eth0:
  • V. Adding Keepalived to an Nginx load balancer

1. Environment:

Hostname  IP             Role
lb01      192.168.56.12  Nginx + Keepalived (MASTER)
lb02      192.168.56.13  Nginx + Keepalived (BACKUP)
web01     192.168.56.11  web01 service (Nginx)
web02     192.168.0.130  web02 service (Nginx)

2. Configure web01 and web02

[root@web01 vhosts]# cat www.abc.org.conf
server {
listen ;
server_name 192.168.56.11;
root /vhosts/html/www;
index index.html index.htm index.php;
}
[root@web02 vhosts]# cat www.abc.org.conf
server {
listen ;
server_name 192.168.0.130;
root /vhosts/html/www;
index index.html index.htm index.php;
}

Test the home pages of web01 and web02; their content differs so the two can be told apart:
[root@localhost vhosts]# curl 192.168.56.11
welcome to 192.168.56.11
[root@localhost vhosts]# curl 192.168.0.130:
welcome to use 192.168.0.130

3. Configure Nginx load balancing on lb01 and lb02

[root@lb01 keepalived]# cat /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
worker_connections 1024;
}

http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;

include /etc/nginx/mime.types;
default_type application/octet-stream;

include /etc/nginx/conf.d/*.conf;
upstream web_server_pool {
server 192.168.56.11:80 weight=1;
server 192.168.0.130:8080 weight=1;
}
server {
listen 80;
server_name 192.168.56.20;    # server_name must be set to the VIP address
location / {
proxy_pass http://web_server_pool;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $remote_addr;
}
}
}

4. Configure Keepalived on lb01 and lb02

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb01
}

vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}

[root@lb02 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb02
}

vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/ dev eth0 label eth0:
}
}

Note where the Keepalived configurations on lb01 and lb02 differ, for example router_id and state.

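5. Access test

Open http://192.168.56.20 directly and refresh the page: the responses alternate between the two backends, which shows that Nginx load balancing is working, as shown in the figure: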

[Figure: two screenshots (alt text: Nginx + keepalived high-availability cluster)]
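The same check can be made from a shell by sending several requests to the VIP and tallying the responses. This is a minimal sketch: `tally_backends` is a hypothetical helper that counts identical lines on standard input, and it assumes each backend answers with a single distinctive line, as in the curl tests earlier.

```shell
#!/bin/bash
# Count how often each distinct response line appears on stdin and print
# "count<TAB>response" for each one.
tally_backends() {
    sort | uniq -c | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count "\t" $0 }'
}

# Example (not run here):
#   for i in 1 2 3 4; do curl -s http://192.168.56.20/; done | tally_backends
```

With both upstream servers at weight=1, the counts should come out roughly equal; a single line in the tally suggests one backend is not being reached.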

Next, stop keepalived on lb01 and check whether the site stays reachable:

[root@lb01 keepalived]# systemctl stop keepalived
[root@lb01 keepalived]# ip addr |grep "192.168.56.20"
[root@lb02 keepalived]# ip addr |grep "192.168.56.20"  # after keepalived stops on lb01, the VIP is on lb02
inet 192.168.56.20/ scope global secondary eth0:

Open http://192.168.56.20 again: the site still responds as before, which shows that Keepalived high availability is working, as shown in the figure:

[Figure: two screenshots (alt text: Nginx + keepalived high-availability cluster)]

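6. Fixing the missing Nginx health check

  The steps so far give us Nginx reverse proxying and load balancing as well as Keepalived high availability. By default, however, Keepalived takes over only when the peer machine goes down or the peer's Keepalived service stops. In practice, Nginx on one load balancer can die while Keepalived on that node keeps running, and then requests to the VIP 192.168.56.20 find no service behind it. To see this, stop Nginx on lb01 and check access again.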

First run an access test; everything is normal:
[root@localhost vhosts]# curl 192.168.56.20
welcome to 192.168.56.110
[root@localhost vhosts]# curl 192.168.56.20
welcome to use 192.168.0.130

Stop nginx on lb01; the VIP is still on lb01:
[root@lb01 keepalived]# systemctl stop nginx
[root@lb01 keepalived]# ip addr |grep "192.168.56.20"
inet 192.168.56.20/ scope global secondary eth0:

Test access again; the connection is refused:
[root@localhost vhosts]# curl 192.168.56.20
curl: () Failed connect to 192.168.56.20:; Connection refused

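So how can the VIP be moved to the backup node when the business service itself goes down? This is where a Keepalived check script comes in. Start with a script like this: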

#!/bin/bash
d=`date --date today +%Y%m%d_%H:%M:%S`
counter=$(ps -C nginx --no-heading |wc -l)
if [ "${counter}" = "0" ]; then
systemctl start nginx.service
sleep
counter=$(ps -C nginx --no-heading|wc -l)
if [ "${counter}" = "0" ]; then
echo "$d nginx was down.keepalived will stop." >> /var/log/check_ng.log
systemctl stop keepalived
fi
fi

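When this script finds that the number of nginx processes is 0, it tries to start nginx again and recounts the processes. If the count is still 0, it stops the keepalived service, which triggers the high-availability failover. For the experiment, to make the effect easy to see, use the script below instead: as soon as the nginx process count is 0, it stops keepalived immediately.

This script must exist on both lb01 and lb02, at /etc/keepalived/check_nginx.sh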
[root@lb01 keepalived]# cat check_nginx.sh
#!/bin/bash
d=`date --date today +%Y%m%d_%H:%M:%S`
counter=$(ps -C nginx --no-heading|wc -l)
if [ "$counter" -eq 0 ]; then
echo "$d nginx was down.keepalived will stop." >> /var/log/check_ng.log
systemctl stop keepalived
fi
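A variation worth knowing, which is not what this article uses, is to let the check script report through its exit status and leave keepalived running. When a tracked vrrp_script exits non-zero and its weight is 0 (the default; the weight used in this article is not shown), keepalived puts the instance into the FAULT state and releases the VIP, and the node can rejoin by itself once the script succeeds again. The sketch below isolates the decision in a hypothetical function, `check_nginx_count`, so it can be tried without a running nginx.

```shell
#!/bin/bash
# Health check that reports through its return status instead of stopping
# keepalived. The argument is the number of nginx processes found.
check_nginx_count() {
    local count=$1
    if [ "$count" -eq 0 ]; then
        return 1    # non-zero: the tracked script is considered failed
    fi
    return 0        # zero: nginx is considered healthy
}

# As the body of a check script this would end with (not run here):
#   check_nginx_count "$(ps -C nginx --no-heading | wc -l)"
#   exit $?
```

Compared with `systemctl stop keepalived`, this avoids having to restart keepalived by hand after nginx has been repaired.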

Next, edit keepalived.conf on lb01 and lb02 and add the script block:

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
@qq.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout
router_id lb01
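}

vrrp_script chk_nginx {  # define a VRRP script that checks the nginx process; be sure to keep the space before "{", otherwise the script will not run
script "/etc/keepalived/check_nginx.sh"  # script to execute; when Nginx has a problem, it stops Keepalived
interval 2        # check every 2 s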
weight
}

vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
192.168.56.20/
}
track_script {
chk_nginx    # enable the chk_nginx script for VRRP instance VI_1
}
}
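A check script that is missing or not executable is an easy thing to overlook on one of the two nodes. The sketch below is a small pre-flight test; `script_ready` is a hypothetical helper, and the path in the example is the one used in this article.

```shell
#!/bin/bash
# Return 0 only if the check script exists as a regular file and is executable.
script_ready() {
    local path=$1
    [ -f "$path" ] && [ -x "$path" ]
}

# Example (not run here):
#   script_ready /etc/keepalived/check_nginx.sh || chmod +x /etc/keepalived/check_nginx.sh
```

Running it on both lb01 and lb02 before restarting keepalived helps confirm that the two nodes are set up the same way.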

Test procedure and results:

(1) On lb01, check the keepalived VIP and processes and the nginx port
[root@lb01 keepalived]# !ip
ip addr |grep "192.168.56.20"
inet 192.168.56.20/ scope global secondary eth0
[root@lb01 keepalived]# netstat -tulnp |grep nginx
tcp 0.0.0.0: 0.0.0.0:* LISTEN /nginx: master
[root@lb01 keepalived]# ps -ef |grep keepalived
root : ? :: /usr/sbin/keepalived -D
root : ? :: /usr/sbin/keepalived -D
root : ? :: /usr/sbin/keepalived -D

(2) Simulate an Nginx failure: stop the Nginx service, then check the items from (1) again
[root@lb01 keepalived]# systemctl stop nginx
[root@lb01 keepalived]# !nets
netstat -tulnp |grep nginx
[root@lb01 keepalived]# !ip
ip addr |grep "192.168.56.20"
[root@lb01 keepalived]# ps -ef |grep keepalived
root : pts/ :: grep --color=auto keepalived

(3) On lb02, check whether the VIP is present and verify that the web service is reachable
[root@lb02 keepalived]# !ip
ip addr |grep "192.168.56.20"
inet 192.168.56.20/ scope global secondary eth0
[root@localhost vhosts]# curl 192.168.56.20
welcome to use 192.168.0.130
[root@localhost vhosts]# curl 192.168.56.20
welcome to 192.168.56.110

With the check script in place, Nginx + Keepalived now fails over when Nginx itself goes down, not only when the node or Keepalived does.

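7. A script to detect Keepalived split brain

  To guard against split brain, you can also run a monitoring script on the backup server: if the backup can ping the master and the backup also holds the VIP, raise an alert.

(1) Write the monitoring script on lb02 and run it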

[root@lb02 keepalived]# cat check_split_brain.sh
#!/bin/bash
lb01_vip="192.168.56.20"
lb01_ip="192.168.56.12"
while true
do
ping -c -W $lb01_ip &>/dev/null
if [ $? -eq -a `ip add|grep "$lb01_vip"|wc -l` -eq ]
then
echo "ha is split brain.warning."
else
echo "ha is ok."
fi
sleep
done
[root@lb02 keepalived]# sh check_split_brain.sh
ha is ok.
ha is ok.
ha is ok.

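Under normal conditions the master is alive and the VIP 192.168.56.20 stays on the master, so no alert is raised and the script prints: ha is ok

(2) Simulate split brain: stop Keepalived on the master lb01 and watch the script output on lb02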

[root@lb01 keepalived]# systemctl stop keepalived
[root@lb02 keepalived]# sh check_split_brain.sh
ha is ok.
ha is split brain.warning.
ha is split brain.warning.

As shown above, the script reports a split-brain warning, so it can be hooked into a monitoring service such as Zabbix to alert on split brain.
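The decision at the heart of the monitoring script can be isolated into a function, which makes it easier to reuse from a monitoring agent. This is a sketch under assumptions: `split_brain_status` is a hypothetical helper, and "the backup holds the VIP" is taken to mean at least one matching line in the `ip addr` output. It uses two separate tests joined with `&&` rather than `-a` inside a single `[ ]`, which POSIX marks as obsolescent.

```shell
#!/bin/bash
# Decide the status from two observations made on the backup node:
#   $1 = exit status of the ping to the master (0 means reachable)
#   $2 = number of `ip addr` lines on the backup that contain the VIP
split_brain_status() {
    local ping_rc=$1 vip_lines=$2
    if [ "$ping_rc" -eq 0 ] && [ "$vip_lines" -ge 1 ]; then
        echo "ha is split brain.warning."
    else
        echo "ha is ok."
    fi
}

split_brain_status 0 0   # prints: ha is ok.
split_brain_status 0 1   # prints: ha is split brain.warning.
```

As the simulation shows, the same condition also fires when the master host is still up but its Keepalived has stopped, so an alert from this check may need a closer look before it is treated as a real split brain.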