Detailed explanation of Nagios remote monitoring installation and configuration

2. Add new configuration files
First create a simple configuration file, with the following contents:

define timeperiod{
    timeperiod_name 24x7
    alias      24 Hours A Day, 7 Days A Week
    sunday     00:00-24:00
    monday     00:00-24:00
    tuesday     00:00-24:00
    wednesday    00:00-24:00
    thursday    00:00-24:00
    friday     00:00-24:00
    saturday    00:00-24:00
    }

This definition is self-explanatory and needs little comment. Monitoring around the clock (7x24) is recommended.
The second manually created configuration file defines the contacts; its format is as follows:

define contact {
    contact_name     sa  ; no spaces allowed in the name
    alias        system administrator
    service_notification_period  24x7
    host_notification_period    24x7
    service_notification_options  w,u,c,r
    host_notification_options    d,u,r
    service_notification_commands service-notify-by-sms,service-notify-by-email ; these commands are read from the configuration file
    host_notification_commands   host-notify-by-email,host-notify-by-sms ; these commands are read from the configuration file
    email             sery@
    pager             13333333333 ; mobile phone number that receives the alarm message
    }   ; do not omit this closing brace
define contact {
    contact_name     sery
    alias        system administrator
    service_notification_period  24x7
    host_notification_period    24x7
    service_notification_options  w,u,c,r
    host_notification_options    d,u,r
    service_notification_commands service-notify-by-sms,service-notify-by-email
    host_notification_commands   host-notify-by-email,host-notify-by-sms
    email             sery@
    pager             13312345678
    }

The file above defines two contacts; add further contacts in the same format. The values used in the service notification options (service_notification_options) and host notification options (host_notification_options) are: for services, w = warning, u = unknown, c = critical, r = recovery; for hosts, d = down, u = unreachable, r = recovery. Note that u means different things for host alarms and for service alarms.
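As a quick way to double-check an options string before putting it into a contact definition, the flag meanings listed above can be captured in a small lookup. This is only a sketch: the function name describe_options is made up for illustration, and it covers only the flags mentioned in this article.

```python
# Flag meanings as listed in the text; any other flag is rejected here.
SERVICE_FLAGS = {"w": "warning", "u": "unknown", "c": "critical", "r": "recovery"}
HOST_FLAGS = {"d": "down", "u": "unreachable", "r": "recovery"}


def describe_options(options, kind):
    """Expand a comma-separated flag string such as "w,u,c,r" into readable names.

    kind is "service" or "host"; the same letter (u) maps to a different
    meaning depending on which table is used.
    """
    table = SERVICE_FLAGS if kind == "service" else HOST_FLAGS
    names = []
    for flag in options.split(","):
        flag = flag.strip()
        if flag not in table:
            raise ValueError(f"unknown {kind} notification flag: {flag!r}")
        names.append(table[flag])
    return names


if __name__ == "__main__":
    print(describe_options("w,u,c,r", "service"))
    print(describe_options("d,u,r", "host"))
```

A typo such as putting c into host_notification_options raises ValueError here instead of surfacing later during verification.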
The third manually created configuration file defines the contact groups and builds on the contacts defined in the previous file. It is relatively simple, and its format is as follows:

define contactgroup {
    contactgroup_name  sagroup ; no spaces allowed in the name
    alias        system administrator group
    members       sa,sery ; this example has 2 members
    }

Multiple members are separated by commas. Additional contact groups are added to the file in the same format.
Now the key player finally appears: the configuration file that defines the hosts. Here is the basic form of two hosts I defined:

#define monitor host

#################################################################
# Wangjing IDC servers                     #
#################################################################
define host {
    host_name         nagios-server
    alias           nagios server
    address          ..x.49
    contact_groups       sagroup ; multiple contact groups are separated by commas
    check_command       check-host-alive
    max_check_attempts     5
    notification_interval   10  ; adjustable; choose a value that suits your environment
    notification_period    24x7
    notification_options    d,u,r
    }

define host {
    host_name         24-25
    alias           server 24-25
    address          .24.25
    contact_groups       sagroup
    check_command       check-host-alive ; an alarm notification is sent when the host is down
    max_check_attempts     5
    notification_interval   10
    notification_period    24x7
    notification_options    d,u,r
    }

Add more hosts one by one in this format. Tip: for a contiguous IP range, it is best to write a script that generates the file. To ease future maintenance, use easy-to-read comments in the file wherever possible (such as # Wangjing IDC servers # in this example).
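The tip about generating host definitions for a contiguous IP range could be sketched in Python as follows. The function name generate_hosts and the address prefix 192.0 are placeholders chosen for illustration, not values from the original setup; the host naming (third octet, dash, fourth octet) and the directive values mirror the 24-25 example above.

```python
# Template mirroring the host definitions shown above.
HOST_TEMPLATE = """define host {{
    host_name               {name}
    alias                   server {name}
    address                 {address}
    contact_groups          sagroup
    check_command           check-host-alive
    max_check_attempts      5
    notification_interval   10
    notification_period     24x7
    notification_options    d,u,r
    }}
"""


def generate_hosts(prefix, third_octet, first, last):
    """Return host definitions for prefix.third_octet.first .. prefix.third_octet.last."""
    blocks = []
    for fourth in range(first, last + 1):
        name = f"{third_octet}-{fourth}"
        address = f"{prefix}.{third_octet}.{fourth}"
        blocks.append(HOST_TEMPLATE.format(name=name, address=address))
    return "\n".join(blocks)


if __name__ == "__main__":
    # "192.0" is a placeholder prefix (documentation address range).
    print(generate_hosts("192.0", 2, 25, 26))
```

Redirecting the output to a file and then running the verification step keeps hand editing, and the typos that come with it, to a minimum.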
The other heavyweight configuration file is the one that defines the services; without it, nothing gets monitored. Here is a sample:

#service definition

##############################################################
# Wangjing IDC servers service for host-live        #
##############################################################
define service {
    host_name    nagios-server
    service_description  check-host-alive
    check_period     24x7
    max_check_attempts  4
    normal_check_interval 3
    retry_check_interval 2
    contact_groups    sagroup
    notification_interval  10
    notification_period   24x7
    notification_options  w,u,c,r
    check_command      check-host-alive ; check whether the host is alive
    }
define service {
    host_name    74-210
    service_description  check_tcp 80
    check_period     24x7
    max_check_attempts  4
    normal_check_interval 3
    retry_check_interval 2
    contact_groups    sagroup
    notification_interval  10
    notification_period   24x7
    notification_options  w,u,c,r
    check_command   check_tcp!80 ; check whether the service on TCP port 80 is working
    }

When writing these definitions, pay attention to the delimiter (!) between check_tcp and the port of the service to be monitored. If there are many services, consider generating the definitions with a script.
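Generating the check_tcp service definitions by script might look like the sketch below. The function name generate_tcp_services and the shape of its input (a mapping from host name to a list of ports) are assumptions made for this example; the directive values are copied from the sample service above.

```python
# Template mirroring the check_tcp service definition shown above.
SERVICE_TEMPLATE = """define service {{
    host_name               {host}
    service_description     check_tcp {port}
    check_period            24x7
    max_check_attempts      4
    normal_check_interval   3
    retry_check_interval    2
    contact_groups          sagroup
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
    check_command           check_tcp!{port}
    }}
"""


def generate_tcp_services(ports_by_host):
    """Return one check_tcp service definition per (host, port) pair."""
    blocks = []
    for host, ports in ports_by_host.items():
        for port in ports:
            blocks.append(SERVICE_TEMPLATE.format(host=host, port=port))
    return "\n".join(blocks)


if __name__ == "__main__":
    print(generate_tcp_services({"74-210": [80]}))
```

Because the ! delimiter lives in the template, it cannot be mistyped for an individual service.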
The host group configuration file is optional and builds on the hosts file. Its format is as follows:

define hostgroup {
     hostgroup_name sa-servers
     alias      sa servers
     members     nagios-server,24-25,24-26 ; separate multiple hosts with commas
     }

Multiple host groups are added one by one according to the above format. A screenshot of a host group is given later.
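Since a host group only works if each member is a defined host, a rough consistency check can be run before handing the files to nagios. The sketch below is a simplified parser written for this article, not the parser nagios uses: it assumes plain define blocks like the ones shown here, with # comment lines and ; inline comments, and the function names are made up.

```python
import re

# Matches "define <type> { ... }" blocks; assumes no "}" inside directive values.
BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def strip_comments(text):
    """Drop lines starting with # and anything after a ; on a line."""
    kept = []
    for line in text.splitlines():
        line = line.split(";", 1)[0]
        if line.lstrip().startswith("#"):
            continue
        kept.append(line)
    return "\n".join(kept)


def parse_objects(text):
    """Return a list of (object_type, {directive: value}) tuples."""
    objects = []
    for kind, body in BLOCK_RE.findall(strip_comments(text)):
        directives = {}
        for line in body.splitlines():
            parts = line.strip().split(None, 1)
            if parts:
                directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((kind, directives))
    return objects


def missing_hostgroup_members(text):
    """Map each hostgroup name to the members that are not defined hosts."""
    objects = parse_objects(text)
    hosts = {d["host_name"] for kind, d in objects
             if kind == "host" and "host_name" in d}
    missing = {}
    for kind, d in objects:
        if kind != "hostgroup":
            continue
        members = [m.strip() for m in d.get("members", "").split(",") if m.strip()]
        absent = [m for m in members if m not in hosts]
        if absent:
            missing[d.get("hostgroup_name", "?")] = absent
    return missing
```

Passing the concatenated text of the host and host group files to missing_hostgroup_members returns an empty dict when every member resolves.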

After much effort, these configuration files are finally saved. Now I can't wait to run the program /usr/local/nagios -v /usr/local/nagios/etc/ to check the correctness of all the configuration files. If you are very lucky, the following appears at the end of the output:

Total Warnings: 0
Total Errors:  0

Things look okay - No serious problems were detected during the pre-flight check

That output means everything is in order. I was not so lucky, though, and had to fix many places before I succeeded. Thankfully, the error report from this verification is very useful (unlike some system help documents). Here is the output produced by an error that I introduced deliberately:

[root@netmonitor nagios]# bin/nagios -v etc/

Nagios 2.5
Copyright (c) 1999-2006 Ethan Galstad ()
Last Modified: 07-13-2006
License: GPL

Reading configuration data...

Error: Could not find any host matching 'nagios-server'
Error: Could not expand member hosts specified in hostgroup 
(config file '/usr/local/nagios/etc/', starting on line 2)
………………………

It tells me which configuration file is causing the error and where (I had deliberately added a comment symbol to the configuration file as a test). Once verification passes, run the command /usr/local/nagios -d /usr/local/nagios/etc/ to start nagios as a daemon, then use ps -aux | grep nagios to see whether the process is running. At this point, the nagios service is basically configured. When writing or distributing configurations, a few small tricks reduce the chance of errors: for example, define just a few hosts and services first, and add the rest once they verify correctly.
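When configurations are generated or distributed by script, the verification step can be scripted too. The sketch below reads the pre-flight output shown earlier; the function names are invented for this example, and the paths of the nagios binary and the main configuration file are left as parameters because they depend on the installation. Only the -v option and the output lines quoted in this article are relied on.

```python
import re
import subprocess


def parse_preflight(output):
    """Extract warning/error totals and "Error:" lines from nagios -v output."""
    warnings = re.search(r"Total Warnings:\s*(\d+)", output)
    errors = re.search(r"Total Errors:\s*(\d+)", output)
    error_lines = [line.strip() for line in output.splitlines()
                   if line.startswith("Error:")]
    return {
        "warnings": int(warnings.group(1)) if warnings else None,
        "errors": int(errors.group(1)) if errors else None,
        "error_lines": error_lines,
    }


def config_ok(output):
    """True only if the summary reports zero errors and no Error: lines appear."""
    summary = parse_preflight(output)
    return summary["errors"] == 0 and not summary["error_lines"]


def run_preflight(nagios_binary, main_config):
    """Run the verification and return its standard output (paths supplied by caller)."""
    result = subprocess.run([nagios_binary, "-v", main_config],
                            capture_output=True, text=True)
    return result.stdout
```

A deployment script could call config_ok(run_preflight(...)) and start the daemon only when it returns True.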
Acceptance
Point a browser at the IP address and directory of the server where Nagios is installed, such as http://61...X/nagios, enter the username and password required for authentication, and then click the relevant links in the side menu of the page to view the various statuses. Shut down a service monitored by nagios, or unplug the network cable of a monitored server. Wait a few minutes, then click the "Service Detail" hyperlink and check whether the page shows an eye-catching red alarm.

After a while, you will receive an alarm text message and an alarm email. Then, once the stopped services are started again or the unplugged network cable is reconnected, the red alarm on the web page disappears and a recovery notification arrives by text message or email. If you see the same behavior, the setup is complete.
Nagios is very powerful. In my project I kept nagios as simple as possible to suit my own needs, without using agents, extra plug-ins, and so on. It works very well on a network of no more than 1,000 servers. With more servers, it is recommended to use a MySQL database to manage the monitored objects. I made many choices while deploying nagios; for more details, please refer to the official documentation.