Installing and Configuring the Nagios Monitoring System

Date: 2021-07-05 17:06:56

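Nagios is a free, open-source, enterprise-grade monitoring tool. Its focus is keeping services running normally and raising alerts when a service runs into problems.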

1. Lab environment

Nagios server: 10.20.2.233

Nagios monitored hosts: web1 (10.20.2.235), web2 (10.20.2.236)

2. Deploying the Nagios server

1) Install the packages Nagios depends on

Use yum to install the dependencies Nagios needs in one step:

yum -y install gd gd-devel openssl openssl-devel httpd php gcc glibc glibc-common make net-snmp wget

2) Create the nagios account and group

At configure time, --with-nagios-user and --with-nagios-group specify the account and group that Nagios runs as.

useradd nagios

3) Source download URLs

Nagios:

http://superb-sea2.dl.sourceforge.net/project/nagios/nagios-4.x/nagios-4.2.1/nagios-4.2.1.tar.gz

Nagios-plugin:

https://nagios-plugins.org/download/nagios-plugins-2.1.2.tar.gz

Nrpe:

http://pilotfiber.dl.sourceforge.net/project/nagios/nrpe-3.x/nrpe-3.0.1.tar.gz
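
The three archives can be fetched in one pass. The following is a minimal sketch, assuming wget (installed in step 1) and that the mirror URLs listed above are still reachable. The helper names archive_name and fetch_sources, and any download directory passed to them, are illustrative and not part of the original procedure.

```shell
# Source URLs exactly as listed in this article.
NAGIOS_URLS="
http://superb-sea2.dl.sourceforge.net/project/nagios/nagios-4.x/nagios-4.2.1/nagios-4.2.1.tar.gz
https://nagios-plugins.org/download/nagios-plugins-2.1.2.tar.gz
http://pilotfiber.dl.sourceforge.net/project/nagios/nrpe-3.x/nrpe-3.0.1.tar.gz
"

# archive_name URL: print the file name component of a URL.
archive_name() {
    printf '%s\n' "${1##*/}"
}

# fetch_sources DIR: download every archive into DIR, skipping files that
# are already present. DIR is a hypothetical choice; the article does not
# say where the archives are downloaded to.
fetch_sources() {
    mkdir -p "$1" || return 1
    for url in $NAGIOS_URLS; do
        if [ -f "$1/$(archive_name "$url")" ]; then
            continue
        fi
        wget -P "$1" "$url" || return 1
    done
}
```

Running fetch_sources with a directory of your choice, then changing into that directory, leaves the tarballs where the tar commands in the following steps expect them.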

4) Installing Nagios

tar -zxf nagios-4.2.1.tar.gz -C /usr/local
cd /usr/local
cd nagios-4.2.1/
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make all
make install                # install the main program, CGIs, and HTML files
make install-init           # install the init script /etc/init.d/nagios
make install-commandmode    # install and configure directory permissions
make install-config         # install the sample configuration files
# Nagios is managed and monitored through a web interface, so make install-webconf
# is used to generate the extra Apache configuration file /etc/httpd/conf.d/nagios.conf
make install-webconf

5) Installing the Nagios plugins and NRPE

tar -zxf nagios-plugins-2.1.2.tar.gz -C /usr/local
cd /usr/local/nagios-plugins-2.1.2/
./configure --prefix=/usr/local/nagios
make
make install
tar -zxf nrpe-3.0.1.tar.gz -C /usr/local/
cd /usr/local
cd nrpe-3.0.1/
./configure --prefix=/usr/local/nagios
make all
make install-plugin
make install-daemon
make install-daemon-config
chown nagios:nagios -R /usr/local/nagios

6) Disable SELinux and stop the firewall

setenforce 0
service iptables stop
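
setenforce 0 only switches SELinux to permissive mode until the next reboot. The following is a minimal sketch of making the change persistent, assuming the standard /etc/selinux/config file with a SELINUX= line. The function name set_selinux_mode is made up for illustration and is not part of the original procedure.

```shell
# set_selinux_mode FILE MODE: rewrite the SELINUX= line in FILE to MODE
# (enforcing, permissive, or disabled). Comment lines and the
# SELINUXTYPE= line are left untouched.
set_selinux_mode() {
    case "$2" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $2" >&2; return 1 ;;
    esac
    # Write through a temporary file, then copy the content back so the
    # original file keeps its ownership and permissions.
    sed "s/^SELINUX=.*/SELINUX=$2/" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

For example, set_selinux_mode /etc/selinux/config permissive (run as root) keeps the setting across reboots. On SysV-init systems, chkconfig iptables off likewise stops the iptables service from starting at boot.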

7) Create the web access account

htpasswd -c /usr/local/nagios/etc/htpasswd.users tomcat

8) Start Nagios

/etc/init.d/httpd start
/etc/init.d/nagios start

9) Edit the Nagios configuration files

Main configuration file: nagios.cfg

The main configuration file uses the cfg_file directive to load the other configuration files. For easier management, each of the two monitored hosts gets its own file: web1.cfg for 10.20.2.235 and web2.cfg for 10.20.2.236.

vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
# The next two files must be created by hand; they are used to monitor the two web servers
cfg_file=/usr/local/nagios/etc/web1.cfg
cfg_file=/usr/local/nagios/etc/web2.cfg
……
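
Because web1.cfg and web2.cfg do not exist until they are created by hand, it can be useful to list any cfg_file entries that point at missing files before starting Nagios. The following is a rough sketch. The function name missing_cfg_files is hypothetical, and it assumes each entry is a plain cfg_file=/path line with no trailing comment or whitespace.

```shell
# missing_cfg_files NAGIOS_CFG: print every cfg_file= path in NAGIOS_CFG
# that does not exist on disk. Commented-out entries are ignored because
# only lines that start with "cfg_file=" are considered.
missing_cfg_files() {
    grep '^cfg_file=' "$1" | while IFS='=' read -r key path; do
        [ -f "$path" ] || printf '%s\n' "$path"
    done
}
```

Calling missing_cfg_files /usr/local/nagios/etc/nagios.cfg prints nothing once every referenced file is in place.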

Edit the CGI configuration file (cgi.cfg) and add the account used to access the web pages:

vi /usr/local/nagios/etc/cgi.cfg
use_authentication=1
authorized_for_system_information=nagiosadmin,tomcat
authorized_for_configuration_information=nagiosadmin,tomcat
authorized_for_system_commands=nagiosadmin,tomcat
authorized_for_all_services=nagiosadmin,tomcat
authorized_for_all_hosts=nagiosadmin,tomcat
authorized_for_all_service_commands=nagiosadmin,tomcat
authorized_for_all_host_commands=nagiosadmin,tomcat
……

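Edit the command configuration file (commands.cfg). This file defines how each command is actually carried out, for example which tool sends alert e-mails and how the message body is formatted.

vi /usr/local/nagios/etc/objects/commands.cfg
……
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
……
# The following must be added by hand; it is used to monitor remote hosts and requires the nrpe package
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }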
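
To see what Nagios ends up running for a service whose check_command is check_nrpe!check_load, it helps to expand the macros by hand. The sketch below mimics that substitution for the three macros in the check_nrpe definition. It is an illustration, not how Nagios itself is implemented, and it assumes $USER1$ resolves to /usr/local/nagios/libexec, the usual resource.cfg value for a source install. The helper names command_args and expand_command_line are made up.

```shell
# command_args CHECK_COMMAND: print the text after the first "!".
# With a single argument, that text is what Nagios passes as $ARG1$.
command_args() {
    case "$1" in
        *!*) printf '%s\n' "${1#*!}" ;;
        *)   printf '\n' ;;
    esac
}

# expand_command_line TEMPLATE USER1 HOSTADDRESS ARG1: substitute the three
# macros in a command_line template. Assumes the values contain none of the
# characters "|", "&", or "\", which would confuse the sed replacement.
expand_command_line() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|'"$2"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$3"'|g' \
        -e 's|\$ARG1\$|'"$4"'|g'
}
```

For host web1, expanding the check_nrpe template with command_args 'check_nrpe!check_load' as the last argument yields the same kind of check_nrpe -H ... -c check_load command line that can be run by hand from the server.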

 

Edit the nrpe configuration file (nrpe.cfg), which defines the commands needed to monitor remote hosts:

vi /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
# The next line is added by hand
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
……

Edit the local host configuration file (localhost.cfg), which controls how the resources of the Nagios server itself are monitored.

vi /usr/local/nagios/etc/objects/localhost.cfg
……
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        }
……
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         localhost     ; Comma separated list of hosts that belong to this group
        }
……

 

Create the remote monitoring configuration files web1.cfg and web2.cfg, which monitor the system resources and services of the remote servers; localhost.cfg can serve as a reference template. The full content of web1.cfg is listed below. For web2.cfg, copy web1.cfg and change the host name and IP address.

define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               web1
        alias                   test.com
        address                 10.20.2.235
        }
define hostgroup{
        hostgroup_name  webs ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         web1     ; Comma separated list of hosts that belong to this group
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        notifications_enabled           1
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             Sys_Load
        check_command                   check_nrpe!check_load
        notifications_enabled           1
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             Current Users
        check_command                   check_nrpe!check_users
        notifications_enabled           1
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             Total Processes
        check_command                   check_nrpe!check_total_procs
        notifications_enabled           1
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           1
        }
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       web1
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1
        }
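
Since web2.cfg differs from web1.cfg only in the host name and IP address, it can be generated instead of edited by hand. The following is a minimal sketch using sed, assuming web1.cfg already exists. derive_host_cfg is an illustrative helper name. It replaces every occurrence of the old host name, so the hostgroup members line changes too, while the alias is left as-is for manual review.

```shell
# derive_host_cfg TEMPLATE OLD_NAME NEW_NAME OLD_IP NEW_IP:
# print TEMPLATE with the host name and IP address swapped.
derive_host_cfg() {
    # Escape the dots so the IP is matched literally rather than as a regex.
    old_ip_pattern=$(printf '%s' "$4" | sed 's/\./\\./g')
    sed -e "s/$2/$3/g" -e "s/$old_ip_pattern/$5/g" "$1"
}
```

For the hosts in this article, the call would be derive_host_cfg /usr/local/nagios/etc/web1.cfg web1 web2 10.20.2.235 10.20.2.236 with the output redirected to /usr/local/nagios/etc/web2.cfg.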

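10) Reload the Nagios configuration

The other configuration files need no changes and can be used as they are. Restart Nagios to reload all of the configuration: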

/etc/init.d/nagios restart
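
A broken object file keeps Nagios from coming back up after a restart. One way to guard against that is to run the built-in configuration check (nagios -v) first and restart only if it passes. The sketch below assumes the default source-install paths used in this article. The function name restart_if_valid and the NAGIOS_* variables are illustrative.

```shell
# Paths follow the source install described above; override them if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

# restart_if_valid: verify the configuration, then restart only on success.
restart_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        "$NAGIOS_INIT" restart
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

If the check fails, the running Nagios instance is left untouched and the verification output shows which file and line need attention.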

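3. Deploying the monitored hosts

The steps below use web1 as the example; the procedure for web2 is identical.

1) Install the packages the Nagios plugins depend on with yum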

yum -y install openssl openssl-devel

2) Create the nagios user and group

useradd -s /sbin/nologin nagios

3) Install Nagios-plugin

tar -zxf nagios-plugins-2.1.2.tar.gz -C /usr/local
cd /usr/local/
cd nagios-plugins-2.1.2/
./configure
make
make install

4) Install Nrpe

tar -zxf nrpe-3.0.1.tar.gz -C /usr/local
cd /usr/local/nrpe-3.0.1/
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
chown -R nagios:nagios /usr/local/nagios

 

5) Edit the nrpe configuration file

cp /usr/local/nrpe-3.0.1/sample-config/nrpe.cfg /usr/local/nagios/etc/
vi /usr/local/nagios/etc/nrpe.cfg
……
allowed_hosts=127.0.0.1,10.20.2.233
……
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
# The next line is added by hand
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
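
The allowed_hosts line decides which addresses may query nrpe, so a missing server address is a common reason for failed checks. The following is a small sketch for confirming that the Nagios server is listed. The function name nrpe_allows is hypothetical, and it only handles literal, comma-separated addresses without surrounding spaces, not host names or network ranges.

```shell
# nrpe_allows NRPE_CFG IP: succeed if IP appears as an exact entry
# in the allowed_hosts= line of NRPE_CFG.
nrpe_allows() {
    grep '^allowed_hosts=' "$1" | cut -d= -f2 | tr ',' '\n' | grep -Fxq "$2"
}
```

On web1, nrpe_allows /usr/local/nagios/etc/nrpe.cfg 10.20.2.233 should succeed before nrpe is started.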

6) Disable SELinux and stop the firewall

setenforce 0
service iptables stop

 

7) Start nrpe

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

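4. Verification and monitoring

1) Verify nrpe on the monitored hosts

From the Nagios server, the administrator uses check_nrpe to check performance parameters on the monitored hosts. Run with only a host address, check_nrpe reports the nrpe version running on that host: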

[root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.235
NRPE v3.0.1
[root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.236
NRPE v3.0.1
[root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.237
connect to address 10.20.2.237 port 5666: Connection refused
connect to host 10.20.2.237 port 5666: Connection refused
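
With more than a couple of hosts, the same check is easier to run in a loop. The sketch below assumes the check_nrpe path used above and relies on the standard plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN). The function names status_name and probe_hosts are made up for illustration.

```shell
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# status_name CODE: translate a plugin exit code into its state name.
status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# probe_hosts HOST...: run check_nrpe against each host and print
# "<host> <state> <plugin output>" for every one of them.
probe_hosts() {
    for host in "$@"; do
        output=$("$CHECK_NRPE" -H "$host" 2>&1)
        rc=$?
        printf '%s %s %s\n' "$host" "$(status_name "$rc")" "$output"
    done
}
```

For the lab in this article, probe_hosts 10.20.2.235 10.20.2.236 prints one line per monitored host, which makes an unreachable nrpe daemon easy to spot.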

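2) Monitor through the web interface

The output above shows that nrpe on the monitored hosts is reachable, so the web interface can now be opened in a browser: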

http://10.20.2.233/nagios


This article comes from the "11528244" blog; please keep this attribution: http://11538244.blog.51cto.com/11528244/1851350