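1. Introduction
Load balancing (LB) and high availability (HA): this section covers a two-node hot-standby setup that provides high availability. Keepalived is the hot-standby solution chosen here, and the concrete configuration is walked through below.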
2. Downloading Keepalived
wget http://www.keepalived.org/software/keepalived-1.4.0.tar.gz
Documentation: http://www.keepalived.org/doc
References: https://www.cnblogs.com/abclife/p/7909818.html
https://www.cnblogs.com/kevingrace/p/6138185.html
System: Debian 8
./configure --prefix=/opt/keepalive # this step may require installing some extra dependencies
make
make install
mkdir /etc/keepalived
cp ./etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf
cp ./sbin/keepalived /usr/sbin/
vim /etc/init.d/keepalived
chmod a+x /etc/init.d/keepalived
service keepalived start
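Since the configure step may stop on missing dependencies, a quick pre-flight check can save a round trip. The following is only a minimal sketch: the function name missing_tools and the tool list (gcc, make) are assumptions for illustration, not an official list of Keepalived build requirements.

```shell
#!/bin/sh
# missing_tools TOOL...: print every listed tool that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical minimum for a source build; extend the list as ./configure complains.
missing_tools gcc make
```

An empty output means every listed tool was found; anything printed still has to be installed before running ./configure.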
Add the following content to /etc/init.d/keepalived:
#!/bin/sh
#
# keepalived High Availability monitor built upon LVS and VRRP
#
# chkconfig: -
# description: Robust keepalive facility to the Linux Virtual Server project \
# with multilayer TCP/IP stack checks.

### BEGIN INIT INFO
# Provides: keepalived
# Required-Start: $local_fs $network $named $syslog
# Required-Stop: $local_fs $network $named $syslog
# Should-Start: smtpdaemon httpd
# Should-Stop: smtpdaemon httpd
# Default-Start:
# Default-Stop:
# Short-Description: High Availability monitor built upon LVS and VRRP
# Description: Robust keepalive facility to the Linux Virtual Server
# project with multilayer TCP/IP stack checks.
### END INIT INFO

# Source function library.
. /etc/rc.d/init.d/functions

exec="/usr/sbin/keepalived"
prog="keepalived"
config="/etc/keepalived/keepalived.conf"

[ -e /etc/sysconfig/$prog ] && . /etc/sysconfig/$prog

lockfile=/var/lock/subsys/keepalived

start() {
    # 5 = program not installed, 6 = program not configured
    [ -x $exec ] || exit 5
    [ -e $config ] || exit 6
    echo -n $"Starting $prog: "
    daemon $exec $KEEPALIVED_OPTIONS
    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    killproc $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    echo -n $"Reloading $prog: "
    # signal 1 (SIGHUP) makes the daemon reload its configuration
    killproc $prog -1
    retval=$?
    echo
    return $retval
}

force_reload() {
    restart
}

rh_status() {
    status $prog
}

rh_status_q() {
    rh_status &>/dev/null
}

case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        # 7 = program is not running
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?
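3. Hot Standby (Master/Backup Mode)
Edit the configuration file keepalived.conf:
vim /etc/keepalived/keepalived.conf
global_defs {
notification_email { # addresses Keepalived emails when an event occurs, one per line
yyyyy@qq.com
}
notification_email_from yyyyy@qq.com # sender address
smtp_server 192.168.8.208 # SMTP server used to send the email
smtp_connect_timeout # connection timeout
router_id nginx_dev # identifier of the machine running Keepalived; may be the same on both nodes or different
}
vrrp_instance nginx_dev {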
state MASTER
interface eth1
#mcast_src_ip 172.16.23.203
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.222
}
}
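Then run service keepalived restart.
The current host can now be reached at 172.16.23.222. If it cannot, check whether the client is on the same network, that is, whether the address is reachable by routing. Keepalived makes this work by broadcasting gratuitous ARP packets that announce the VIP to the machines on the internal network.
Configure the other machine in the same way, changing state MASTER to state BACKUP and giving it a lower priority, since the MASTER's priority must be higher than the BACKUP's.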
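To see which of the two nodes currently holds the VIP, the address list of the interface can be inspected. The following is a minimal sketch: the function name holds_vip is made up for illustration, and it assumes the one-line output format of ip -o -4 addr show, where the fourth field is ADDRESS/PREFIX.

```shell
#!/bin/sh
# holds_vip VIP: read `ip -o -4 addr show` output on stdin and succeed
# if the given address is assigned to an interface.
holds_vip() {
    vip=$1
    # Field 4 of the one-line format is ADDRESS/PREFIX; compare the address part.
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}

# Usage on a live node (eth1 and the VIP are the values used in this article):
#   ip -o -4 addr show dev eth1 | holds_vip 172.16.23.222 && echo "VIP is on this node"
```

Running the usage line on both machines should report the VIP on exactly one of them, namely the node that is currently MASTER.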
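4. Hot Standby (Master/Master Mode)
In master/backup mode, failures are rare under normal conditions, so one machine usually sits idle, which wastes hardware. For that reason the master/master variant of hot standby is the one generally recommended now. Master/master simply means defining two instances, with each machine acting as master for one and as backup for the other.
Configuration on 172.16.23.203:
global_defs {
notification_email { # addresses Keepalived emails when an event occurs, one per line
xxx@aa.com
}
notification_email_from xxx@aa.com # sender address
smtp_server 192.168.8.208 # SMTP server used to send the email
smtp_connect_timeout # connection timeout
router_id nginx_dev_1 # identifier of the machine running Keepalived; may be the same on both nodes or different
router_id nginx_dev_2
}
vrrp_instance nginx_dev_1 {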
state MASTER
interface eth1
mcast_src_ip 172.16.23.203
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.222
}
}
vrrp_instance nginx_dev_2 {
state BACKUP
interface eth1
mcast_src_ip 172.16.23.203
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.223
}
}
Configuration on 172.16.23.204:
global_defs {
notification_email { # addresses Keepalived emails when an event occurs, one per line
xxxx@aa.com
}
notification_email_from xxxx@aa.com # sender address
smtp_server 192.168.8.208 # SMTP server used to send the email
smtp_connect_timeout # connection timeout
router_id nginx_dev_1 # identifier of the machine running Keepalived; may be the same on both nodes or different
router_id nginx_dev_2
}
vrrp_instance nginx_dev_2 {
state MASTER
interface eth1
mcast_src_ip 172.16.23.204
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.223
}
}
vrrp_instance nginx_dev_1 {
state BACKUP
interface eth1
mcast_src_ip 172.16.23.204
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.222
}
}
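These two configurations are essentially the same, with each node being master for one instance and backup for the other. One setting that is important to change is virtual_router_id: the two instances must use different values.
With this setup, requests to 172.16.23.222 go to host 172.16.23.203 first, and requests to 172.16.23.223 go to host 172.16.23.204. When one host goes down, its VIP switches automatically to the healthy host; the switchover takes only a few seconds.
This requires two IP addresses, and the client has to choose between them to spread the load. That step can be handled by distributing both addresses through DNS.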
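Besides distributing the two addresses through DNS, a client script can also pick a VIP itself. The following is a minimal sketch of that idea and not part of Keepalived: first_reachable and http_probe are hypothetical helper names, and the probe assumes curl is installed and that the service answers plain HTTP on port 80.

```shell
#!/bin/sh
# first_reachable PROBE ADDR...: print the first address for which
# `PROBE ADDR` succeeds; return 1 if none does.
first_reachable() {
    probe=$1
    shift
    for addr in "$@"; do
        if "$probe" "$addr"; then
            printf '%s\n' "$addr"
            return 0
        fi
    done
    return 1
}

# Hypothetical probe: an HTTP request with a 2-second timeout (requires curl).
http_probe() {
    curl -s -o /dev/null --max-time 2 "http://$1/"
}

# Usage: first_reachable http_probe 172.16.23.222 172.16.23.223
```

Because Keepalived moves a failed node's VIP to the surviving node, both addresses normally keep answering; the fallback only matters during the few seconds of a switchover.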
root@debian-t6:/usr/local/nginx/html# ip addr
: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN group default
link/loopback ::::: brd :::::
inet 127.0.0.1/ scope host lo
valid_lft forever preferred_lft forever
: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu qdisc pfifo_fast state UP group default
link/ether fc:aa::9d:d9:5a brd ff:ff:ff:ff:ff:ff
inet 172.16.23.204/ brd 172.16.23.255 scope global eth1
valid_lft forever preferred_lft forever
inet 172.16.23.223/ scope global eth1
valid_lft forever preferred_lft forever
inet 172.16.23.222/ scope global eth1
valid_lft forever preferred_lft forever
The logs are in /var/log/messages:
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) sent priority
Jan :: debian-t6 Keepalived_healthcheckers[]: Stopped
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) removing protocol VIPs.
Jan :: debian-t6 Keepalived_vrrp[]: Stopped
Jan :: debian-t6 Keepalived_healthcheckers[]: Opening file '/etc/keepalived/keepalived.conf'.
Jan :: debian-t6 Keepalived_vrrp[]: Registering Kernel netlink reflector
Jan :: debian-t6 Keepalived_vrrp[]: Registering Kernel netlink command channel
Jan :: debian-t6 Keepalived_vrrp[]: Registering gratuitous ARP shared channel
Jan :: debian-t6 Keepalived_vrrp[]: Opening file '/etc/keepalived/keepalived.conf'.
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) removing protocol VIPs.
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) removing protocol VIPs.
Jan :: debian-t6 Keepalived_vrrp[]: Using LinkWatch kernel netlink reflector...
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) Entering BACKUP STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP sockpool: [ifindex(), proto(), unicast(), fd(,)]
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) forcing a new MASTER election
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Transition to MASTER STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Received advert with higher priority , ours
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Entering BACKUP STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Transition to MASTER STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) Transition to MASTER STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Entering MASTER STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) setting protocol VIPs.
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_2) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.223
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) Entering MASTER STATE
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) setting protocol VIPs.
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: VRRP_Instance(nginx_dev_1) Sending/queueing gratuitous ARPs on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
Jan :: debian-t6 Keepalived_vrrp[]: Sending gratuitous ARP on eth1 for 172.16.23.222
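Most of the log above is gratuitous ARP noise; the lines that matter for failover are the ones where an instance enters a state. A minimal sketch for filtering them out follows; the function name vrrp_transitions is made up, and it assumes the message wording shown in the log above.

```shell
#!/bin/sh
# vrrp_transitions: read Keepalived log lines on stdin and print
# "INSTANCE STATE" for every line where an instance enters MASTER or BACKUP.
vrrp_transitions() {
    sed -nE 's/.*VRRP_Instance\(([^)]*)\) Entering (MASTER|BACKUP) STATE.*/\1 \2/p'
}

# Usage: vrrp_transitions < /var/log/messages
```

"Transition to MASTER STATE" lines are deliberately ignored, since the transition can still be cancelled by an advert with a higher priority, as the nginx_dev_2 lines in the log show.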
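5. Other Keepalived Settings
The following covers some other settings and advanced usage.
(1) Email notification: not set up here. A separate third-party monitoring service such as Nagios or Zabbix is recommended instead.
(2) router_id: identifies this node's name, usually the hostname.
(3) vrrp_instance: defines a VRRP instance.
(4) state: the instance's initial state, either MASTER or BACKUP, written in upper case. In preemptive mode MASTER is the working state and BACKUP the standby state. When the server holding MASTER fails, the BACKUP server automatically switches from BACKUP to MASTER. When the failed MASTER server recovers, the promoted server returns from MASTER to BACKUP.
(5) interface: the network interface that serves external traffic, that is, the interface the VIP is bound to. Servers often have two or more interfaces, so take care to pick the right one.
(6) mcast_src_ip: the local IP address, used as the source address of the VRRP multicast advertisements.
(7) virtual_router_id: the ID of the virtual router (VRID). All nodes of the same instance must use the same value, because nodes sharing a VRID form one group; different instances must use different values. The VRID also determines the virtual router's MAC address.
(8) priority: the node priority, from 1 to 254. The MASTER must have a higher value than the BACKUP.
(9) advert_int: the interval, in seconds, between the advertisements that MASTER and BACKUP use to check on each other.
(10) authentication: authentication type and password. PASS (password) mode is the usual choice, for example {auth_type PASS auth_pass 123456}. MASTER and BACKUP of the same VRRP instance must use the same password to communicate.
(11) nopreempt: disables preemption. By default, when the MASTER goes down the BACKUP is promoted to MASTER and takes over, and when the original MASTER recovers, the promoted node drops back to BACKUP and hands the work back. With nopreempt configured, the recovered node does not take the service back. For nopreempt to take effect, the instance's state must be set to BACKUP.
(12) virtual_ipaddress: the pool of virtual IP addresses (the VRRP HA virtual addresses). Several addresses may be listed, one per line, and a netmask does not have to be specified. The addresses here must match the VIPs that clients are given.
(13) notify_master: script to run when the node switches to the master state.
(14) notify_backup: script to run when the node switches to the backup state.
(15) notify_fault: script to run when the node enters the fault state (it could also send an email, for example).
(16) track_script: the check scripts this instance monitors.
(17) vrrp_script: defines a VRRP check script.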
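How priority, track_script and the weight of a vrrp_script interact is easier to see with numbers. The following is a minimal sketch of the arithmetic, not Keepalived's actual implementation: the function names and the values 100, 90 and -20 are assumptions for illustration. A negative weight is subtracted from the priority while the check fails, a positive weight is added while it succeeds, and the node advertising the highest priority becomes MASTER.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT CHECK_OK
# Priority a node advertises with one tracked script (CHECK_OK: 1 = passing, 0 = failing).
effective_priority() {
    base=$1 weight=$2 ok=$3
    if [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $((base + weight))      # negative weight: lowered while the check fails
    elif [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $((base + weight))      # positive weight: raised while the check succeeds
    else
        echo "$base"
    fi
}

# elect_master NAME:PRIORITY...: print the name with the highest priority.
# Ties are not modelled here; the first entry simply wins.
elect_master() {
    best_name= best_prio=-1
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name best_prio=$prio
        fi
    done
    echo "$best_name"
}

# Hypothetical values: 203 has priority 100, 204 has 90, weight is -20, nginx on 203 is down.
elect_master "203:$(effective_priority 100 -20 0)" "204:90"
```

With these numbers 203 drops to 80, below 204's 90, so 204 takes over the VIP. The takeaway is that the absolute value of a negative weight has to be larger than the priority gap between the two nodes, otherwise a failed check never triggers a switchover.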
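6. Advanced Keepalived Usage
172.16.23.203
/etc/keepalived/keepalived.conf
global_defs {
router_id nginx_dev_1
}
# monitors whether applications such as Nginx or Redis are running
vrrp_script chk_nginx_port {
script "/etc/keepalived/check.sh" # check through a script; the result is judged by its return value
interval 2 # script interval: run every 2 seconds
weight - # on a failed check, the priority is changed by this amount
fall 2 # only 2 consecutive failed checks count as a failure
rise 1 # a single successful check counts as success, but the priority is not modified
}
vrrp_instance nginx_dev_1 {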
state MASTER
interface eth1
mcast_src_ip 172.16.23.203
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress { # VRRP HA virtual addresses; more than one may be listed
172.16.23.222
}
notify_master "/etc/keepalived/run.sh master1"
notify_backup "/etc/keepalived/run.sh backup1"
notify_fault "/etc/keepalived/run.sh fault1"
track_script {
chk_nginx_port # run the corresponding check script
}
}
vrrp_instance nginx_dev_2 {
state BACKUP
interface eth1
mcast_src_ip 172.16.23.203
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.223 # VRRP HA virtual address
}
notify_master "/etc/keepalived/run.sh master2"
notify_backup "/etc/keepalived/run.sh backup2"
notify_fault "/etc/keepalived/run.sh fault2"
}
/etc/keepalived/check.sh
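#!/bin/bash
# Count the running nginx processes.
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" = "0" ]; then
    # nginx is down: try to start it once.
    service nginx start
    #sleep # sleeping here caused problems during execution
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" = "0" ]; then
        # Still down: stop Keepalived so the VIP moves to the other node.
        service keepalived stop
    fi
fi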
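The check.sh above forces a failover by stopping Keepalived itself. Since a vrrp_script is judged by its return value, a variant can simply report health through its exit status and leave the reaction to fall, rise and weight. The following is a minimal sketch of that variant: the names count_procs, check_service and the COUNT_CMD override hook are hypothetical and only exist to keep the logic testable.

```shell
#!/bin/sh
# count_procs NAME: number of running processes with the given command name.
count_procs() {
    ps -C "$1" --no-heading | wc -l
}

# check_service NAME: succeed (exit status 0) when at least one such process runs.
# COUNT_CMD is a hypothetical hook that lets another counter be substituted.
check_service() {
    name=$1
    counter=$(${COUNT_CMD:-count_procs} "$name")
    [ "$counter" -gt 0 ]
}

# As a vrrp_script target, the file would end with the line
#   check_service nginx
# so that the exit status (0 = healthy, non-zero = failed) is what Keepalived sees.
```

This variant does not try to restart nginx; whether an automatic restart belongs in the check script is a design choice, and the original script's restart-then-recheck behaviour can be kept if preferred.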
/etc/keepalived/run.sh
#!/bin/sh
# Append the current time and the state label passed by Keepalived (for example master1).
echo "$(date +%H:%M:%S) $1" >> /etc/keepalived/time.txt
172.16.23.204
/etc/keepalived/keepalived.conf
global_defs {
router_id nginx_dev_2 # identifier of the machine running Keepalived; may be the same on both nodes or different
}
# monitors whether applications such as Nginx or Redis are running
vrrp_script chk_nginx_port {
script "/etc/keepalived/check.sh"
interval
weight -
fall
rise
}
vrrp_instance nginx_dev_2 {
state MASTER
interface eth1
mcast_src_ip 172.16.23.204
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.223
}
notify_master "/etc/keepalived/run.sh master1"
notify_backup "/etc/keepalived/run.sh backup1"
notify_fault "/etc/keepalived/run.sh fault1"
track_script {
chk_nginx_port
}
}
vrrp_instance nginx_dev_1 {
state BACKUP
interface eth1
mcast_src_ip 172.16.23.204
virtual_router_id
priority
advert_int
authentication {
auth_type PASS
auth_pass
}
virtual_ipaddress {
172.16.23.222
}
notify_master "/etc/keepalived/run.sh master2"
notify_backup "/etc/keepalived/run.sh backup2"
notify_fault "/etc/keepalived/run.sh fault2"
track_script {
chk_nginx_port
}
}
/etc/keepalived/check.sh
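#!/bin/bash
# Count the running nginx processes.
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" = "0" ]; then
    # nginx is down: try to start it once.
    service nginx start
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" = "0" ]; then
        # Still down: stop Keepalived so the VIP moves to the other node.
        service keepalived stop
    fi
fi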
/etc/keepalived/run.sh
#!/bin/sh
# Append the current time and the state label passed by Keepalived (for example master1).
echo "$(date +%H:%M:%S) $1" >> /etc/keepalived/time.txt
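All of the .sh files above must be made executable with chmod a+x *.sh, and it is best to reference them by absolute path in keepalived.conf.
Test procedure:
First start Keepalived and Nginx on 203.
Then start Keepalived and Nginx on 204.
Opening 172.16.23.223 and 172.16.23.222 in a browser both work, and the requests are spread across the two hosts.
Scenario 1
On 204, run service nginx stop && service nginx status to simulate an abnormal exit and check the status.
A few seconds later, service nginx status on 204 shows that nginx has been restarted automatically.
Scenario 2
Edit check.sh and comment out the service nginx start line.
On 204, run service nginx stop && service nginx status again to simulate an abnormal exit and check the status.
A few seconds later, service nginx status and service keepalived status on 204 show that nginx is stopped and that Keepalived on 204 has been stopped as well.
At this point, tail -f time.txt on 203 shows a new master entry, which means the VIP has switched over.
Both 172.16.23.223 and 172.16.23.222 still respond in the browser, but they now point at the same machine.
Uncomment the line in check.sh again.
Start Keepalived on 204 with service keepalived start; because the line is uncommented again, nginx is started automatically as well.
tail -f time.txt on 203 now shows a backup entry, meaning that 203 has switched back to the BACKUP state for that instance.
Requests to the two IPs from the browser are once again spread across both hosts.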
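While running these scenarios, time.txt grows by one line per state change, so a short summary of the latest state per instance is handy. The following is a minimal sketch; it assumes run.sh writes lines of the form HH:MM:SS label, where the label is the argument given in notify_master, notify_backup or notify_fault (master1, backup2 and so on), and the function name last_states is made up.

```shell
#!/bin/sh
# last_states: read "HH:MM:SS LABEL" lines on stdin and print, per instance
# number, the most recent state and the time it was entered.
last_states() {
    awk '{
        inst = $2;  sub(/^[a-z]+/, "", inst)     # trailing number = instance
        state = $2; sub(/[0-9]+$/, "", state)    # leading word = master/backup/fault
        last[inst] = state " at " $1
    }
    END { for (i in last) print "instance " i ": " last[i] }' | sort
}

# Usage: last_states < /etc/keepalived/time.txt
```

After scenario 2 has moved both VIPs onto 203, the summary on 203 should report master for both instances; once Keepalived is running on 204 again, one of them should be back to backup.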