SANOG 16 - Nagios Installation and Configuration
Notes:
------
* Commands preceded with "$" imply that you should execute the command as
a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific prompts (e.g. "RTR-GW>" or "mysql>")
imply that you are executing commands on remote equipment, or within
another program.
Exercises
---------
PART I
------
1. Start by enabling Nagios
* in /etc/rc.conf add the following line:
nagios_enable="YES"
Configuration templates are available in /usr/local/etc/nagios as
*.cfg-sample files. Copy them to *.cfg files where required and
edit to suit your needs. Documentation is available in HTML form
in /usr/local/www/nagios/docs.
Before we can use Nagios, we need to configure it -- we will do
this in Part II.
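If you prefer to script the rc.conf change, the following is a minimal sketch. The function name enable_rc_var is made up for this example. It assumes that one "VAR=value" line per variable is enough, and it only appends when no assignment for the variable exists yet (it does not change an existing "NO" to "YES").

```shell
#!/bin/sh
# Hypothetical helper: append VAR="YES" to an rc.conf-style file
# unless a line assigning VAR is already present.
enable_rc_var() {
    rc_file=$1
    var=$2
    if grep -q "^[[:space:]]*${var}=" "$rc_file" 2>/dev/null; then
        # An assignment exists; leave it alone and report it.
        echo "${var} already set in ${rc_file}"
    else
        printf '%s="YES"\n' "$var" >> "$rc_file"
        echo "added ${var} to ${rc_file}"
    fi
}

# Example (as root):
#   enable_rc_var /etc/rc.conf nagios_enable
```

Running the helper twice leaves a single nagios_enable line in the file.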
2. Configure Apache for Nagios
* create /usr/local/etc/apache22/Includes/nagios.conf
In the file, add:
<Directory /usr/local/www/nagios>
    Order deny,allow
    Allow from all
    Options ExecCGI
    AllowOverride AuthConfig
</Directory>
ScriptAlias /nagios/cgi-bin/ /usr/local/www/nagios/cgi-bin/
Alias /nagios/ /usr/local/www/nagios/
Save this file and exit.
3. Create the Web user password file:
# htpasswd -c /usr/local/etc/nagios/htpasswd.users nagiosadmin
New password:
Re-type new password:
We suggest you use the standard user password from class.
Now, create a .htaccess file to ask for a password when opening
the Nagios page:
# vi /usr/local/www/nagios/.htaccess
In the file, add:
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/etc/nagios/htpasswd.users
require valid-user
Save the file, and exit.
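A quick way to catch a typo in the .htaccess file is to check that the password file it names exists and contains the user you created. This is only a sketch. check_htaccess is a hypothetical helper name, and it assumes the AuthUserFile path is written without quotes, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the AuthUserFile named in the
# given .htaccess exists and has an entry for the given user.
check_htaccess() {
    htaccess=$1
    user=$2
    # Take the first AuthUserFile directive and read its path argument.
    pwfile=$(awk '$1 == "AuthUserFile" { print $2; exit }' "$htaccess")
    [ -n "$pwfile" ] && [ -f "$pwfile" ] && grep -q "^${user}:" "$pwfile"
}

# Example (as root):
#   check_htaccess /usr/local/www/nagios/.htaccess nagiosadmin && echo OK
```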
4. The web interface of Nagios should be ready at this point, but
most views won't work since Nagios has not been configured yet!
- Open a browser, and go to
http://wsXX.ws3.conference.sanog.org/nagios/
- At the login prompt, login as:
user: nagiosadmin
pass:
Now we need to configure Nagios.
PART II
Configuring Equipment
-----------------------------------------------------------------------------
1. Let's configure Nagios
# cd /usr/local/etc/nagios/
# cp cgi.cfg-sample cgi.cfg <- Web module config
# cp resource.cfg-sample resource.cfg <- Nagios internal config
# cp nagios.cfg-sample nagios.cfg <- Nagios main config
# cd /usr/local/etc/nagios/objects/
# cp commands.cfg-sample commands.cfg <- plugin configuration
# cp contacts.cfg-sample contacts.cfg <- contact people
# cp templates.cfg-sample templates.cfg <- predefined objects
# cp timeperiods.cfg-sample timeperiods.cfg <- timeperiods
# cp localhost.cfg-sample localhost.cfg <- a sample config
This is the most basic and minimal configuration -- we have taken
all the default configuration files and enabled them.
The last file, "localhost.cfg", defines a monitoring configuration
for your own PC (wsXX).
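The copy commands in step 1 can also be written as a small loop. The sketch below uses a hypothetical function, copy_samples, which takes a directory and a list of base names and never overwrites an existing .cfg file. It copies only the files you name, so other sample files in the directory are left alone.

```shell
#!/bin/sh
# Hypothetical helper: copy NAME.cfg-sample to NAME.cfg inside a
# directory, skipping any target that already exists.
copy_samples() {
    dir=$1
    shift
    for name in "$@"; do
        sample="$dir/$name.cfg-sample"
        target="$dir/$name.cfg"
        if [ ! -f "$sample" ]; then
            echo "missing $sample" >&2
        elif [ -e "$target" ]; then
            echo "skip $target (already exists)"
        else
            cp "$sample" "$target" && echo "created $target"
        fi
    done
}

# Example (as root), matching the files listed in step 1:
#   copy_samples /usr/local/etc/nagios cgi resource nagios
#   copy_samples /usr/local/etc/nagios/objects \
#       commands contacts templates timeperiods localhost
```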
2. Let's verify that nagios is happy with the configuration:
# cd /usr/local/etc/nagios
# nagios -v nagios.cfg
You should see:
... output ...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
We are ready to start nagios!
# /usr/local/etc/rc.d/nagios start
3. You can now go back to the web interface
http://wsXX.ws3.conference.sanog.org/nagios/
... over the next few minutes, Nagios will update the status
for the services on the localhost (wsXX).
You can check out "Hostgroup grid" and "Hostgroup overview"
options in the left menu, then click on "localhost" to get
the details.
Look at the file:
/usr/local/etc/nagios/objects/localhost.cfg
... and try to understand what is being monitored, by comparing
what you see in the "localhost" view in the Nagios web interface,
and the .cfg file.
We need to make a small change to the file:
/usr/local/etc/nagios/nagios.cfg
... so it will automatically read all .cfg files from the
/usr/local/etc/nagios/objects directory, and we don't have
to always edit nagios.cfg to add them.
So edit /usr/local/etc/nagios/nagios.cfg, and make the
following changes:
Comment out the following 4 lines:
cfg_file=/usr/local/etc/nagios/objects/commands.cfg
cfg_file=/usr/local/etc/nagios/objects/contacts.cfg
cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg
cfg_file=/usr/local/etc/nagios/objects/templates.cfg
To comment out a line, add '#' at the beginning, so the lines look
like this:
# cfg_file=/usr/local/etc/nagios/objects/commands.cfg
# cfg_file=/usr/local/etc/nagios/objects/contacts.cfg
# cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg
# cfg_file=/usr/local/etc/nagios/objects/templates.cfg
Do the same for
cfg_file=/usr/local/etc/nagios/objects/localhost.cfg
... so it becomes:
# cfg_file=/usr/local/etc/nagios/objects/localhost.cfg
... and add another line:
cfg_dir=/usr/local/etc/nagios/objects
... Now save the file and exit.
One last change:
# cd /usr/local/etc/nagios/objects/
# mv localhost.cfg main.cfg
main.cfg is a nicer name than localhost, since we're going to
be adding new hosts and parameters to it.
Test that nagios is happy:
# nagios -v /usr/local/etc/nagios/nagios.cfg
If Nagios complains, double check your changes, and if
it is still a problem, ask one of the instructors for help.
Finally, restart Nagios:
# /usr/local/etc/rc.d/nagios restart
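The nagios.cfg edit from this step (comment out the cfg_file lines, add one cfg_dir line) can be sketched as a script. use_cfg_dir is a hypothetical helper name. The sketch assumes the active lines start exactly with "cfg_file=" followed by the objects directory, as in the sample file shown above. Keep a backup copy of nagios.cfg before trying it.

```shell
#!/bin/sh
# Hypothetical helper: comment out active cfg_file= lines that point
# into the objects directory and make sure one cfg_dir= line exists.
use_cfg_dir() {
    cfg=$1
    objects_dir=$2
    tmp="${cfg}.tmp.$$"
    # "&" re-inserts the matched text, so the line gets "# " in front.
    sed "s|^cfg_file=${objects_dir}/|# &|" "$cfg" > "$tmp" || return 1
    # Append the cfg_dir line only if it is not there yet.
    grep -q "^cfg_dir=${objects_dir}\$" "$tmp" ||
        echo "cfg_dir=${objects_dir}" >> "$tmp"
    # Write back in place so the original file keeps its permissions.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Example (as root):
#   use_cfg_dir /usr/local/etc/nagios/nagios.cfg /usr/local/etc/nagios/objects
```

Lines that are already commented out are left untouched, and running the helper a second time changes nothing.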
4. Let's start monitoring another computer in our classroom:
- Pick any other WS in the class, which you will monitor.
# cd /usr/local/etc/nagios/objects
# vi ws-all.cfg
define host {
use freebsd-server
host_name wsYY
alias WS YY in WS3
address _______________ [wsYY's IP address here]
}
Note: YY is *another* machine in the class, not your own.
... Save and quit
5. Let's create a new hostgroup for the occasion, and add our host
to it
- Let's add the hostgroup to the "main.cfg" file:
# cd /usr/local/etc/nagios/objects/
# vi main.cfg
Find the section called "Define an optional hostgroup for FreeBSD machines",
and just under it, add:
define hostgroup {
hostgroup_name classroom
alias All WS in the class
members wsYY
}
6. Now let's associate some services to that host
Still in the file "main.cfg", find the section called:
"Define a service to check SSH on the local machine"
and change the line:
host_name localhost
to
hostgroup_name freebsd-servers, classroom
Save the file and exit
7. Verify that your configuration file is OK:
# nagios -v /usr/local/etc/nagios/nagios.cfg
... You should get:
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
8. Reload/Restart Nagios
# /usr/local/etc/rc.d/nagios restart
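Since you will repeat "verify, then restart" many times in these exercises, it may help to restart only when the pre-flight summary is clean. The sketch below reads the output of "nagios -v" on standard input and looks for the two summary lines shown in step 7. preflight_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the pre-flight output read from
# stdin reports zero warnings and zero errors.
preflight_ok() {
    out=$(cat)
    printf '%s\n' "$out" |
        grep -q '^Total Warnings:[[:space:]]*0[[:space:]]*$' &&
    printf '%s\n' "$out" |
        grep -q '^Total Errors:[[:space:]]*0[[:space:]]*$'
}

# Example (as root):
#   nagios -v /usr/local/etc/nagios/nagios.cfg | preflight_ok \
#       && /usr/local/etc/rc.d/nagios restart
```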
9. Go to the web interface (http://wsXX.ws3.conference.sanog.org/nagios)
and check the host you just added.
10. Add ALL the PCs (WS1 - WS15) in your classroom.
- Remember to verify the configuration file!
- I suggest that you create a single config file called "ws-all.cfg"
to do this, and put all the hosts in it.
- You will repeat step 4 for each machine.
- When finished, remember to add all the hosts into the "classroom" group
in the file main.cfg. The format of the members statement is:
members wsXX,wsYY,wsZZ,...
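Typing fifteen host blocks by hand is error-prone, so here is a sketch that generates them from a list of "name address" pairs. gen_hosts and gen_members are hypothetical helper names, and the list file (called ws-list.txt in the example) is something you would create yourself with the real IP addresses of the classroom machines.

```shell
#!/bin/sh
# Hypothetical helper: read "name address" pairs on stdin and print one
# host definition per pair, in the same form as step 4.
gen_hosts() {
    while read -r name addr; do
        [ -n "$name" ] || continue        # skip blank lines
        number=${name#ws}                 # "ws7" -> "7"
        cat <<EOF
define host {
    use        freebsd-server
    host_name  $name
    alias      WS $number in WS3
    address    $addr
}
EOF
    done
}

# Hypothetical helper: read the same pairs and print the matching
# "members" line for the classroom hostgroup.
gen_members() {
    awk 'NF >= 2 { list = list (list == "" ? "" : ",") $1 }
         END { print "members " list }'
}

# Example (ws-list.txt holds lines such as "ws1 <IP address of ws1>"):
#   gen_hosts   < ws-list.txt > /usr/local/etc/nagios/objects/ws-all.cfg
#   gen_members < ws-list.txt
```

Paste the output of gen_members into the "classroom" hostgroup in main.cfg, then run the pre-flight check as usual.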
11. Reload/Restart Nagios
# /usr/local/etc/rc.d/nagios restart
- Take a look at http://wsXX.ws3.conference.sanog.org/nagios to see your changes.
- Click on the "Status Map" link to see how things look.
PART III
Adding Services
-----------------------------------------------------------------------------
1. Determine what services to add for what devices
- This is core to how you use Nagios and network monitoring tools in
general. So far we are simply checking SSH to see if the machines
are up on our network. The next step is to decide what services you
wish to monitor for each host.
- In this particular class we have:
pcs: All wsXX are running ssh, http and imap/pop
All student pcs are running an snmp daemon
So, let's configure Nagios to check for all of these services for these
devices.
2. Verify that HTTP is running on the classroom PCs
- In the file main.cfg there is already an entry for the HTTP
service check, so you do not need to create a new one.
Instead, you simply need to change "host_name localhost" for
that service, to use a "hostgroup_name classroom", just
like we did in step 6 in Part II.
So make this change in the "main.cfg" file -- find the section
"Define a service to check HTTP on the local machine", and update
the line:
host_name localhost
to
hostgroup_name classroom
And save the file. This tells Nagios that the HTTP service is not only
running on the single host "localhost", but on ALL hosts in the hostgroup
"classroom".
- Once you are done, run the pre-flight check:
# nagios -v /usr/local/etc/nagios/nagios.cfg
If everything looks good, then restart Nagios and see your changes in the
Nagios web interface.
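The edit in this step (swap host_name for hostgroup_name inside one service definition) can also be sketched with awk. retarget_service is a hypothetical helper name. It assumes each definition starts with a "define service" line and ends with a line containing "}", as in the sample file. It prints the modified file to standard output so you can review it before replacing main.cfg.

```shell
#!/bin/sh
# Hypothetical helper: in the service whose service_description equals
# $2, replace the host_name line with "hostgroup_name $3".
retarget_service() {
    cfg=$1
    desc=$2
    hostgroups=$3
    awk -v desc="$desc" -v hostgroups="$hostgroups" '
        # Start buffering when a service definition opens.
        /^[ \t]*define[ \t]+service[ \t]*[{]/ { inblock = 1; n = 0; wanted = 0 }
        inblock {
            buf[++n] = $0
            if ($1 == "service_description") {
                d = $0
                sub(/^[ \t]*service_description[ \t]+/, "", d)
                sub(/[ \t]*(;.*)?$/, "", d)   # drop trailing comment
                if (d == desc) wanted = 1
            }
            if (index($0, "}") > 0) {
                # Definition closed: print it, rewriting host_name if wanted.
                for (i = 1; i <= n; i++) {
                    if (wanted && buf[i] ~ /^[ \t]*host_name[ \t]/)
                        print "        hostgroup_name          " hostgroups
                    else
                        print buf[i]
                }
                inblock = 0
            }
            next
        }
        { print }
    ' "$cfg"
}

# Example:
#   retarget_service main.cfg HTTP classroom > main.cfg.new
#   diff main.cfg main.cfg.new
```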
3. Check that all hosts answer to ping.
- As with HTTP, a "PING" service using check_ping is already defined, and it
automatically applies to the freebsd-servers group. (Note: you can list additional
groups of hosts for any service check if you wish.) Update the "PING" service
definition in the main.cfg file so that it uses the hostgroup_name classroom,
as we did in the previous step.
- See the previous exercise for the exact change. If you have any questions,
ask your instructor for help.
4. Let's add IMAP and POP monitoring
The *commands* for check_pop and check_imap are already configured,
so all we need to do is create *services* for them.
First, edit the main.cfg file, and at the end, add the service definition
for the check_pop:
define service {
use local-service
hostgroup_name classroom
service_description POP
check_command check_pop
}
Check the configuration (nagios -v ...) and restart Nagios,
then go to the web interface.
Now add an IMAP check in the same fashion: create a second service
definition that uses the "check_imap" command. Don't forget to
update the service_description!
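Since the POP and IMAP definitions differ only in two fields, a small generator can print them. This is only a sketch. gen_service is a hypothetical helper name, and it reuses the local-service template and the classroom hostgroup from the POP example above.

```shell
#!/bin/sh
# Hypothetical helper: print a service definition.
#   $1 = service_description, $2 = check_command, $3 = hostgroup_name
gen_service() {
    cat <<EOF
define service {
    use                  local-service
    hostgroup_name       $3
    service_description  $1
    check_command        $2
}
EOF
}

# Example (as root, from /usr/local/etc/nagios/objects):
#   gen_service IMAP check_imap classroom >> main.cfg
```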
5. One last change...
In the file main.cfg, REMOVE all the lines like:
notifications_enabled 0
(there should be 2 lines like this -- delete them)
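If you want to remove these lines without hunting for them in an editor, the sketch below deletes every line whose first word is notifications_enabled and whose value is 0. strip_notifications_off is a hypothetical helper name. Make a backup of main.cfg first, and run the pre-flight check afterwards.

```shell
#!/bin/sh
# Hypothetical helper: delete "notifications_enabled 0" lines from a
# Nagios object file, leaving every other line untouched.
strip_notifications_off() {
    cfg=$1
    tmp="${cfg}.tmp.$$"
    # Keep all lines except those with the directive set to 0.
    awk '!($1 == "notifications_enabled" && $2 == "0")' "$cfg" > "$tmp" || return 1
    # Write back in place so the original file keeps its permissions.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Example (as root):
#   cp main.cfg main.cfg.bak
#   strip_notifications_off /usr/local/etc/nagios/objects/main.cfg
```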