File exercises-nagios.txt, 29.7 KB (added by admin, 7 years ago)

Sample Nagios exercise set

Nagios Installation and Configuration

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Exercises
---------

Exercises Part I
----------------

0. Log in to your PC or open a terminal window as the sysadm user.

1. You may need to install Nagios version 3. You can do this as root, or as
   the sysadm user with the "sudo" command. As sysadm:

   $ sudo apt-get install nagios3

   Unless you already have an MTA installed, nagios3 will install
   postfix as a dependency. Select the "Internet Site" option. (If you
   wanted to use a different MTA, you would install it before nagios3.)

   You will be prompted for the nagiosadmin password. Give it the normal
   workshop password.

   To get the documentation in /usr/share/doc/nagios3-doc/html/ (which
   can also be read via the Nagios web interface), do:

    $ sudo apt-get install nagios3-doc


2. Look at the file which contains the password. The password is stored
   as a hash, not in clear text:

    $ cat /etc/nagios3/htpasswd.users


3. You should already have a working Nagios!

    - Open a browser, and go to your machine like this:

    http://pcN.ws.nsrc.org/nagios3/

    - At the login prompt, log in as:

        user: nagiosadmin
        pass: <CLASS PASSWORD>

    Browse to the "Host Detail" page to see what's already configured.


4. Let's look at the configuration layout... But, first, let's become the root
   user on your machine:

    $ sudo bash

    # cd /etc/nagios3
    # ls -l

    -rw-r--r-- 1 root root    1882 2008-12-18 13:42 apache2.conf
    -rw-r--r-- 1 root root   10524 2008-12-18 13:44 cgi.cfg
    -rw-r--r-- 1 root root    2429 2008-12-18 13:44 commands.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:33 conf.d
    -rw-r--r-- 1 root root      26 2009-02-14 12:36 htpasswd.users
    -rw-r--r-- 1 root root   42539 2008-12-18 13:44 nagios.cfg
    -rw-r----- 1 root nagios  1293 2008-12-18 13:42 resource.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:32 stylesheets

    # cd conf.d
    # ls -l

    -rw-r--r-- 1 root root 1695 2008-12-18 13:42 contacts_nagios2.cfg
    -rw-r--r-- 1 root root  418 2008-12-18 13:42 extinfo_nagios2.cfg
    -rw-r--r-- 1 root root 1152 2008-12-18 13:42 generic-host_nagios2.cfg
    -rw-r--r-- 1 root root 1803 2008-12-18 13:42 generic-service_nagios2.cfg
    -rw-r--r-- 1 root root  210 2009-02-14 12:33 host-gateway_nagios3.cfg
    -rw-r--r-- 1 root root  976 2008-12-18 13:42 hostgroups_nagios2.cfg
    -rw-r--r-- 1 root root 2167 2008-12-18 13:42 localhost_nagios2.cfg
    -rw-r--r-- 1 root root 1005 2008-12-18 13:42 services_nagios2.cfg
    -rw-r--r-- 1 root root 1609 2008-12-18 13:42 timeperiods_nagios2.cfg

    Notice that the package installs files with "nagios2" in their name.
    This is because they are the same files as were used for the Nagios
    version 2 Debian package. However there was a change made to the
    host-gateway configuration file, so this has a new name.


5. You have a config which is already monitoring your own system
(localhost_nagios2.cfg) and your upstream default gateway
(host-gateway_nagios3.cfg).

Have a look at the config file for the default gateway: it's very simple.
(Note: tab completion is useful here. Type cat host-g then hit tab; the
filename will be filled in for you.)

    # cat host-gateway_nagios3.cfg

    # a host definition for the gateway of the default route
    define host {
            host_name   gateway
            alias       Default Gateway
            address     10.10.0.254
            use         generic-host
            }



PART II
Configuring Equipment
-----------------------------------------------------------------------------

0. Order of configuration

Conceptually we will build our configuration files starting with the
"nearest" devices and then moving on to those further away.

By going in this order you will have defined the devices that act as parents
for other devices.

Remember to refer to the Network Diagram for our classroom if you get confused.

We recommend creating instances like this:

rtr     (the gateway router: 10.10.0.254)
sw      (the gateway switch: 10.10.0.253, parent: rtr)
rtr1    (group 1 router: 10.10.1.254, parent: sw)
rtr2    (group 2 router: 10.10.2.254, parent: sw)
rtr3    (group 3 router: 10.10.3.254, parent: sw)
rtr4    (group 4 router: 10.10.4.254, parent: sw)
rtr5    (group 5 router: 10.10.5.254, parent: sw)
rtr6    (group 6 router: 10.10.6.254, parent: sw)

pc1     (pc in group 1: 10.10.1.1, parent: rtr1)
.
pc2     (pc in group 2: 10.10.2.2, parent: rtr2)
.
pc9     (pc in group 3: 10.10.3.9, parent: rtr3)
.
pc10    (pc in group 4: 10.10.4.10, parent: rtr4)
.
pc17    (pc in group 5: 10.10.5.17, parent: rtr5)
.
pc18    (pc in group 6: 10.10.6.18, parent: rtr6)
.
.
pc26

s1      (on backbone: 10.10.0.241, parent: sw)
s2      (on backbone: 10.10.0.242, parent: sw)
noc     (on backbone: 10.10.0.250, parent: sw)
ap1     (on backbone: 10.10.0.251, parent: sw)
ap2     (on backbone: 10.10.0.252, parent: sw)

We recommend placing these items in the files:

routers.cfg             (rtr, rtr1...rtr6)
switches.cfg            (sw)
pcs.cfg                 (pc1...pc26, s1, s2, noc, ap1, ap2)


1. First we need to tell Nagios to monitor the gateway router for
   our classroom, which is 10.10.0.254:

   # cd /etc/nagios3/conf.d/

Create the entry for the gateway router like this:

   # editor routers.cfg

define host {
    use         generic-host
    host_name   rtr
    alias       Gateway Router
    address     10.10.0.254
}

In the same file create the 6 entries for the group routers:

define host {
    use         generic-host
    host_name   rtr1
    alias       Group 1 Router
    address     10.10.1.254
    parents     sw
}

Repeat this for rtr2, rtr3, rtr4, rtr5 and rtr6.
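
If you would rather not type the five remaining router entries by hand, a
short shell loop can print them for you. This is only a sketch: the function
name router_hosts is made up for this example, and it assumes the pattern of
the rtr1 entry above (address 10.10.N.254, parent "sw") holds for every group.

```shell
#!/bin/sh
# Print host definitions for group routers rtrFIRST..rtrLAST, following the
# rtr1 pattern above: address 10.10.N.254, parent "sw".
# router_hosts is a hypothetical helper name used only in this sketch.
router_hosts() {
    n=$1
    while [ "$n" -le "$2" ]; do
        cat <<EOF
define host {
    use         generic-host
    host_name   rtr$n
    alias       Group $n Router
    address     10.10.$n.254
    parents     sw
}

EOF
        n=$((n + 1))
    done
}

# rtr1 is already in the file, so generate rtr2 through rtr6.
router_hosts 2 6
```

Review the output first. Then, as root in /etc/nagios3/conf.d, you could
append it to the file with "router_hosts 2 6 >> routers.cfg".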

Note that the entry for "sw", our gateway switch, has not yet been created.
That is next.

Exit and save this file.


2. Create a file called switches.cfg and add an entry for the backbone switch:

   # editor switches.cfg

define host {
    use         generic-host
    host_name   sw
    alias       Backbone Switch
    address     10.10.0.253
    parents     rtr
}

At this point Nagios is configured to monitor whether our core hosts (the parents)
are up on our classroom network. Your next steps are to add in the individual hosts
such as the classroom virtual PC images (pc1 to pc26), the Wireless Access Points
(ap1 and ap2), the servers s1 and s2, and the noc.

Be sure you add in a proper "parents" entry for each host.

Remember, if you don't understand the parent relations in our network you can
review the logical network diagram here:

        http://noc.ws.nsrc.org/wiki/wiki/NetworkDiagram

Note the Nagios parent bullet points:

Nagios Parent Relationships


STEPS 2a - 2c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!


2a. Verify that your configuration files are OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    ... You should get some warnings like:
Warning: Host 'rtr' has no services associated with it!
Warning: Host 'sw' has no services associated with it!
etc....
...
Total Warnings: N
Total Errors:   0

Things look okay - No serious problems were detected during the check.

Nagios is saying that it's unusual to monitor a device just for its
existence on the network, without also monitoring some service.


2b. Reload/Restart Nagios

    # /etc/init.d/nagios3 restart

The "restart" option is not always 100% reliable, due to a bug in the Nagios
init script. To be sure, you may want to get used to doing:

    # /etc/init.d/nagios3 stop
    # /etc/init.d/nagios3 start


2c. Go to the web interface (http://pcN.ws.nsrc.org/nagios3) and check that the hosts
   you just added are now visible in the interface. Click on the "Host Detail" item
   on the left of the Nagios screen to see this. You may see it in "PENDING"
   status until the check is carried out.


HINT: You will be doing this a lot. If you do it all on one line, like this,
then you can hit cursor-up and rerun all in one go:

    nagios3 -v /etc/nagios3/nagios.cfg && /etc/init.d/nagios3 restart

The '&&' ensures that the restart only happens if the config is valid.
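
If you prefer the more reliable stop/start sequence from step 2b, the same
idea can be wrapped in a small shell function. This is a sketch, not part of
the Nagios package: the name nagios_reload and the NAGIOS_BIN, NAGIOS_CFG and
NAGIOS_INIT override variables are made up for this example, and the defaults
are the paths used in this exercise.

```shell
#!/bin/sh
# Verify the configuration, then stop and start Nagios only if it is valid.
# nagios_reload and the NAGIOS_* variables are hypothetical names for this
# sketch; the defaults are the command and paths used in this exercise.
nagios_reload() {
    "${NAGIOS_BIN:-nagios3}" -v "${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}" || {
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    }
    "${NAGIOS_INIT:-/etc/init.d/nagios3}" stop
    "${NAGIOS_INIT:-/etc/init.d/nagios3}" start
}
```

Define the function in your root shell, then run "nagios_reload" after each
configuration change.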


3. Create entries for the classroom PCs

Now that we have our routers and switches defined it is quite easy to create
entries for all our PCs.  Think about the parent relationships.

Remember, if you do not understand the parent relationship refer back to the
classroom network diagram here:

        http://noc.ws.nsrc.org/wiki/wiki/NetworkDiagram

Below are three sample entries: one for the NOC, one for pc1 and one for
pc26.  You should be able to use these examples to create entries for all
classroom PCs plus the NOC.

We could put these entries into separate files, but as our network is small
we'll use a single file called pcs.cfg.

NOTE! You do not add in an entry for your own PC or router. This has already
been defined in the file /etc/nagios3/conf.d/localhost_nagios2.cfg.  This
definition is what defines the Nagios network viewpoint. So, when you come to
the spot where you might add an entry for your PC you should skip this and go
on to the next PC in the list.

        # editor pcs.cfg

# Our classroom NOC

define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
    parents     sw
}

# PCs

define host {
    use         generic-host
    host_name   pc1
    alias       pc1
    address     10.10.1.1
    parents     rtr1
}

define host {
    use         generic-host
    host_name   pc26
    alias       pc26
    address     10.10.6.26
    parents     rtr6
}

Pay attention to the parent entries and the IP addresses.

Take the three entries above and now expand this to create the remaining
entries for all active PCs. That is, fill in for PCs 2 through 25 (remember to
skip your PC).
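
The PC entries all follow the same pattern as the samples above: pcN in
group G has address 10.10.G.N and parent rtrG. A sketch of a shell helper
that prints one such entry is shown here. The name pc_host is made up for
this example, and you still need to take each PC's group number from the
classroom network diagram.

```shell
#!/bin/sh
# Print one host definition for pcN in group G, following the pattern of the
# samples above: address 10.10.G.N, parent rtrG.
# pc_host is a hypothetical helper name; arguments: PC_NUMBER GROUP_NUMBER.
pc_host() {
    cat <<EOF
define host {
    use         generic-host
    host_name   pc$1
    alias       pc$1
    address     10.10.$2.$1
    parents     rtr$2
}

EOF
}

# Example: pc9 is listed in group 3 in Part II.
pc_host 9 3
```

Call it once per PC, skipping your own, and append the output to pcs.cfg
after checking it.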


Exit and save the file pcs.cfg

As before, repeat steps 2a-2c to verify your configuration, correct any
errors, and activate it.

4. Look at your Nagios instance on the web. Note that "Status Map" gives
you a graphical view of the parent-child relationships you have defined.


PART III
Configure Service check for the classroom NOC
-----------------------------------------------------------------------------

0. Configuring

Now that we have our hardware configured we can start telling Nagios what services to monitor
on the configured hardware, how to group the hardware in interesting ways, how to group
services, etc.

1. Associate a service check for our classroom NOC

    # editor hostgroups_nagios2.cfg

    - Find the hostgroup named "ssh-servers". In the members section of the definition
      change the line:

members                 localhost

    to

members                 localhost,noc

Exit and save the file.

Verify that your changes are OK:

        # nagios3 -v /etc/nagios3/nagios.cfg

Restart Nagios to see the new service association with your host:

        # /etc/init.d/nagios3 restart

Click on the "Service Detail" link in the Nagios web interface to see your new entry.


PART IV
Defining Services for all PCs
-----------------------------------------------------------------------------

0. For services, the default normal_check_interval is 5 (minutes) in
   generic-service_nagios2.cfg. You may wish to change this to 1 to speed up
   how quickly service issues are detected, at least in the workshop.

1. Determine what services to define for what devices

   - This is core to how you use Nagios and network monitoring tools in
     general. So far we are simply using ping to verify that physical hosts
     are up on our network, and we have started monitoring a single service
     (SSH) on your PC and the NOC. The next step is to decide what services
     you wish to monitor for each host in the classroom.

   - In this particular class we have:

     routers:  running ssh and snmp
     switches: running telnet and possibly ssh as well as snmp
     pcs:      All PCs are running ssh and http and should be running snmp
               The NOC is currently running an snmp daemon

     So, let's configure Nagios to check for these services for these
     devices.

2.) Verify that SSH is running on the routers and workshop PC images

   - In the file services_nagios2.cfg there is already an entry for the SSH
     service check, so you do not need to create one. Instead, you
     simply need to re-define the "ssh-servers" entry in the file
     /etc/nagios3/conf.d/hostgroups_nagios2.cfg. After Part III the entry in
     the file looks like:

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost,noc
        }

     What do you think you should change? Correct, the "members" line. You should
     add in entries for all the classroom pcs, routers and the switches that run ssh.
     With this information and the network diagram you should be able to complete this entry.

     The entry will look something like this:

define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost,pc1,pc2,pc3,pc4...,pc26,....ap1,ap2,s1,s2,noc,rtr1
        }

         Note: leave in "localhost" - This is your PC and represents Nagios' network point of
         view. So, for instance, if you are on "pc3" you would not include "pc3" in the list
         of all the classroom pcs as it is represented by the "localhost" entry.

         The "members" entry will be a long line and will likely wrap on the screen.

         Remember to include all your PCs and all your routers that you have defined. Do not
         include any entries if they are not already defined in pcs.cfg, switches.cfg or
         routers.cfg.

    - Once you are done, run the pre-flight check:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios

    # /etc/init.d/nagios3 stop
    # /etc/init.d/nagios3 start

    and view your changes in the Nagios web interface.
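
The long "members" line in step 2 is easy to mistype. As a sketch, the PC
part of it can be built in the shell. The name pc_members is made up for
this example; it takes the highest PC number and your own PC number, and
leaves your own PC out because "localhost" already covers it.

```shell
#!/bin/sh
# Build a comma-separated list pc1,...,pcLAST that leaves out your own PC,
# which is already covered by "localhost".
# pc_members is a hypothetical helper; arguments: LAST_PC OWN_PC.
pc_members() {
    list=
    n=1
    while [ "$n" -le "$1" ]; do
        if [ "$n" -ne "$2" ]; then
            list="${list:+$list,}pc$n"
        fi
        n=$((n + 1))
    done
    printf '%s\n' "$list"
}

# Example for someone working on pc3 in a six-PC classroom.
echo "members         localhost,$(pc_members 6 3)"
# prints: members         localhost,pc1,pc2,pc4,pc5,pc6
```

For the workshop you would call "pc_members 26 N" with your own PC number,
then remove any PCs you have not defined in pcs.cfg and add the routers and
other hosts by hand.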

To continue with hostgroups you can add additional groups for later use, such as all our virtual
routers. Go ahead and edit the file hostgroups_nagios2.cfg again:

     # editor hostgroups_nagios2.cfg

and add the following to the end of the file:

# A list of our virtual routers
define hostgroup {
        hostgroup_name  cisco7200
        alias           Cisco 7200 Routers
        members         rtr1,rtr2,rtr3,rtr4,rtr5,rtr6
        }

Save and exit from the file. Verify that everything is OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios

    # /etc/init.d/nagios3 stop
    # /etc/init.d/nagios3 start

3.) Check that http is running on all the classroom PCs.

    - This is almost identical to the previous exercise. Just make the same kind of
      change for the HTTP service, adding in each PC (no routers or switches). Remember,
      you don't need to add your machine as it is already defined as "localhost".

4.)  OPTIONAL EXTRA: as opposed to just checking that a web server is
     running on the classroom PCs, you could also check that the nagios3
     service is available, by requesting the /nagios3/ path. This means
     passing extra options to the check_http plugin.

     For a description of the available options, type this:

      # /usr/lib/nagios/plugins/check_http
      # /usr/lib/nagios/plugins/check_http --help

     and of course you can browse the online Nagios documentation or google
     for information on check_http. You can even run the plugin by hand to
     perform a one-shot service check:

     # /usr/lib/nagios/plugins/check_http -H localhost -u /nagios3/

     So the goal is to configure Nagios to call check_http in this way.

define command{
        command_name    check_http_arg
        command_line    /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' $ARG1$
        }

define service {
        hostgroup_name                  nagios-servers
        service_description             NAGIOS
        check_command                   check_http_arg!-u /nagios3/
        use                             generic-service
}

     You will also need to create a hostgroup called nagios-servers to
     link to this service check.

     Once you have done this, check that Nagios warns you about failing
     authentication (because it's trying to fetch the page without providing
     the username/password). There's an extra parameter you can pass to
     check_http_arg to provide that info; see if you can find it.

      WARNING: in the tradition of "Debian Knows Best", their definition of the
      check_http command in /etc/nagios-plugins/config/http.cfg
      is *not* the same as that recommended in the nagios3 documentation.
      It is missing $ARG1$, so any parameters to pass to check_http are
      ignored. So you might think you are monitoring /nagios3/ but actually
      you are monitoring root!

     This is why we had to make a new command definition "check_http_arg".
     You could make a more specific one like "check_nagios", or you could
     modify the Ubuntu check_http definition to fit the standard usage.
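
To make the role of $ARG1$ concrete: with a single argument, the text after
the "!" in check_command is what replaces $ARG1$ in command_line. The sketch
below imitates that substitution with sed so you can see the resulting
command line. The function expand_check is made up for illustration, only
handles $HOSTADDRESS$ and one argument, and is not how Nagios itself is
implemented.

```shell
#!/bin/sh
# Show the command line that results when $HOSTADDRESS$ and $ARG1$ are
# substituted. With a single argument, the text after the "!" in
# check_command becomes $ARG1$.
# expand_check is a hypothetical helper for illustration only.
# Arguments: COMMAND_LINE_TEMPLATE HOST_ADDRESS CHECK_COMMAND
expand_check() {
    arg1=${3#*\!}
    printf '%s\n' "$1" |
        sed -e "s|\\\$HOSTADDRESS\\\$|$2|g" -e "s|\\\$ARG1\\\$|$arg1|g"
}

template="/usr/lib/nagios/plugins/check_http -H '\$HOSTADDRESS\$' \$ARG1\$"
expand_check "$template" 10.10.0.250 'check_http_arg!-u /nagios3/'
# prints: /usr/lib/nagios/plugins/check_http -H '10.10.0.250' -u /nagios3/
```

If the template has no $ARG1$ in it, as in the packaged check_http
definition described above, the "-u /nagios3/" part simply never reaches
the plugin.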



PART V
Create More Host Groups
-----------------------------------------------------------------------------

0. In the web view, look at the pages "Hostgroup Overview", "Hostgroup
   Summary", "Hostgroup Grid". This gives a convenient way to group together
   hosts which are related (e.g. in the same site, serving the same purpose).

1. Update /etc/nagios3/conf.d/hostgroups_nagios2.cfg

    - For the following exercises it will be very useful to create
      or update the following hostgroups:

      debian-servers
      routers
      switches

      If you edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg you
      will see an entry for debian-servers that just contains localhost.
      Update this entry to include all the classroom PCs, including the
      noc (this assumes that you created a "noc" entry in your pcs.cfg
      file). Remember to skip your PC entry as it is represented by the
      localhost entry.

    # editor /etc/nagios3/conf.d/hostgroups_nagios2.cfg

     Update the entry that says:


# A list of your Debian GNU/Linux servers
define hostgroup {
        hostgroup_name  debian-servers
        alias           Debian GNU/Linux Servers
        members         localhost
        }

      So that the "members" parameter contains something like this. Use your
      classroom network diagram to confirm the exact number of machines and names
      in your workshop.

        members         localhost,pc1,pc2,pc3,pc4,pc5,pc6,pc7,pc8,pc9,pc10,pc11,pc12,pc13,pc14,pc15,pc16,pc17,pc18,pc19,pc20,pc21,pc22,pc23,pc24,pc25,pc26

        Be sure that this is a single line in the file (it may wrap on your
        screen) and not two or more separate lines. Otherwise
        you will get an error when you go to restart Nagios. Remember that
        your own PC is "localhost".

      - Once you have done this, add in two more host groups, one for routers and
        one for switches. Call these entries "routers" and "switches".

      - When you are done be sure to verify your work and restart Nagios.

      - Remember to skip your pc entry as it is represented by the localhost entry.
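
One possible shape for the two new hostgroups is sketched below as a small
shell helper that prints a definition. The name hostgroup_def and the aliases
"Routers" and "Switches" are made up for this example. The member names are
the ones suggested in Part II, so list only hosts you actually defined in
routers.cfg and switches.cfg.

```shell
#!/bin/sh
# Print a hostgroup definition. hostgroup_def is a hypothetical helper;
# arguments: NAME "ALIAS" MEMBERS (comma-separated, no spaces).
hostgroup_def() {
    cat <<EOF
define hostgroup {
        hostgroup_name  $1
        alias           $2
        members         $3
        }

EOF
}

# Aliases are illustrative; member names follow the list in Part II.
hostgroup_def routers "Routers" rtr,rtr1,rtr2,rtr3,rtr4,rtr5,rtr6
hostgroup_def switches "Switches" sw
```

Check the output, then paste it at the end of hostgroups_nagios2.cfg and run
the usual verify and restart steps.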

2. Go back to the web interface and look at your new hostgroups


PART VI
Extended Host Information ("making your graphs pretty")
-----------------------------------------------------------------------------

1. Update extinfo_nagios2.cfg

    - If you would like to use appropriate icons for your defined hosts in
      Nagios this is where you do this. We have three types of devices:

      Cisco routers
      Cisco switches
      Ubuntu servers

      There is a fairly large repository of icon images available for you to
      use located here:

      /usr/share/nagios/htdocs/images/logos/

      These were installed by default as dependent packages of the nagios3
      package in Ubuntu. In some cases you can find model-specific icons for
      your hardware, but to make things simpler we will use the following
      icons for our hardware:

      /usr/share/nagios/htdocs/images/logos/base/debian.*
      /usr/share/nagios/htdocs/images/logos/cook/router.*
      /usr/share/nagios/htdocs/images/logos/cook/switch.*

    - The next step is to edit the file /etc/nagios3/conf.d/extinfo_nagios2.cfg
      and tell Nagios what image you would like to use to represent your devices.

    # editor /etc/nagios3/conf.d/extinfo_nagios2.cfg

      Here is what an entry for your routers looks like (there is already an entry
      for debian-servers that will work as is). Note that the router model (3600)
      is not all that important. The image used represents a router in general.

define hostextinfo {
        hostgroup_name   routers
        icon_image       cook/router.png
        icon_image_alt   Cisco Routers (3600)
        vrml_image       router.png
        statusmap_image  cook/router.gd2
}

      Now add an entry for your switches. Once you are done check your
      work and restart Nagios. Take a look at the Status Map in the web interface.
      It should be much nicer, with real icons instead of question marks.


PART VII
Create Service Groups
-----------------------------------------------------------------------------

1. Create service groups for ssh and http for each set of pcs.

   - The idea here is to create several service groups, one for each group
     of PCs in the classroom. We want to see these PCs grouped together
     and include the status of their ssh and http services. To do this, create
     and edit the file:

   # editor /etc/nagios3/conf.d/servicegroups.cfg

     Here is a sample of the service group for group 1:

define servicegroup {
        servicegroup_name       group1-servers
        alias                   group 1 servers
        members                 pc1,SSH,pc1,HTTP,pc2,SSH,pc2,HTTP,pc3,SSH,pc3,HTTP,pc4,SSH,pc4,HTTP
        }

        - Note that the members line must be a single line in the file. It may wrap
          on screen, but it must not be split across two lines.

        - Note that "SSH" and "HTTP" need to be uppercase as this is how the service_description is
          written in the file /etc/nagios3/conf.d/services_nagios2.cfg

        - You should create an entry for other groups of servers too

    - Save your changes, verify your work and restart Nagios. Now if you click on
      the Servicegroup menu items in the Nagios web interface you should see
      this information grouped together.
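
The servicegroup "members" line is a list of host,service pairs, so every
host appears twice: once with SSH and once with HTTP. As a sketch, a shell
helper can build that line from a list of host names. The name sg_members is
made up for this example.

```shell
#!/bin/sh
# Build the host,service pairs for a servicegroup "members" line.
# sg_members is a hypothetical helper; every host given as an argument gets
# an SSH and an HTTP entry, matching the service_description values.
sg_members() {
    list=
    for host in "$@"; do
        list="${list:+$list,}$host,SSH,$host,HTTP"
    done
    printf '%s\n' "$list"
}

sg_members pc1 pc2
# prints: pc1,SSH,pc1,HTTP,pc2,SSH,pc2,HTTP
```

Use "localhost" in place of your own PC, and only name hosts that are
members of both the SSH and HTTP hostgroups, otherwise the verify step will
report an error.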



PART VIII
Configure Guest Access to the Nagios Web Interface
-----------------------------------------------------------------------------

1. Edit /etc/nagios3/cgi.cfg to give read-only guest user access to the Nagios
   web interface.

    - By default Nagios is configured to give full r/w access via the Nagios
      web interface to the user nagiosadmin. You can change the name of this
      user, add other users, change how you authenticate users, what users
      have access to what resources and more via the cgi.cfg file.

    - First, let's create a "guest" user and password in the htpasswd.users
      file.

    # htpasswd /etc/nagios3/htpasswd.users guest

      You can use any password you want (or none). A password of "guest" is
      not a bad choice.

    - Next, edit the file /etc/nagios3/cgi.cfg and look for what type of access
      has been given to the nagiosadmin user. By default you will see the following
      directives (note, there are comments between each directive):

      authorized_for_system_information=nagiosadmin
      authorized_for_configuration_information=nagiosadmin
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin
      authorized_for_all_hosts=nagiosadmin
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

      Now let's tell Nagios to allow the "guest" user some access to
      information via the web interface. You can choose whatever you would
      like, but what is pretty typical is this:

      authorized_for_system_information=nagiosadmin,guest
      authorized_for_configuration_information=nagiosadmin,guest
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin,guest
      authorized_for_all_hosts=nagiosadmin,guest
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

    - Once you make the changes, save the file cgi.cfg, verify your
      work and restart Nagios.

    - To see if you can log in as the "guest" user you may need to clear
      the cookies in your web browser. The web interface will look the
      same. The difference is that a number of items that
      are available via the web interface (forcing a service/host check,
      scheduling checks, comments, etc.) will not work for the guest
      user.
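
The cgi.cfg change above can also be expressed as a sed filter, which makes
it clear that only the four read-only directives gain the guest user. This
is a sketch: the name add_guest is made up for this example, and it reads
the file text on standard input rather than editing cgi.cfg in place.

```shell
#!/bin/sh
# Read cgi.cfg text on stdin and append ",guest" to the four read-only
# directives listed above; the *_commands directives are left untouched.
# add_guest is a hypothetical helper name used only in this sketch.
add_guest() {
    sed -E 's/^(authorized_for_(system_information|configuration_information|all_services|all_hosts)=.*)$/\1,guest/'
}

printf '%s\n' \
    'authorized_for_all_hosts=nagiosadmin' \
    'authorized_for_all_host_commands=nagiosadmin' | add_guest
# prints:
# authorized_for_all_hosts=nagiosadmin,guest
# authorized_for_all_host_commands=nagiosadmin
```

You could run "add_guest < /etc/nagios3/cgi.cfg > /tmp/cgi.cfg.new" and
compare the two files with diff before copying the new one into place. Run
it only once, since each run appends another ",guest".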


OPTIONAL
--------

* Check that SNMP is running on the classroom NOC

    - First you will need to add in the appropriate service check for SNMP in the file
      /etc/nagios3/conf.d/services_nagios2.cfg. This is where Nagios is impressive. There
      are hundreds, if not thousands, of service checks available via the various Nagios
      sites on the web. You can see what plugins are installed by Ubuntu in the nagios3
      package that we've installed by looking in the following directory:

    # ls /usr/lib/nagios/plugins

      As you'll see there is already a check_snmp plugin available to us. If you are
      interested in the options the plugin takes you can execute the plugin from the
      command line by typing:

    # /usr/lib/nagios/plugins/check_snmp
    # /usr/lib/nagios/plugins/check_snmp --help

      to see what options are available, etc. You can use the check_snmp plugin and
      Nagios to create very complex or specific system checks.

    - Now to see all the various service/host checks that have been created using the
      check_snmp plugin you can look in /etc/nagios-plugins/config/snmp.cfg. You will
      see that there are a lot of preconfigured checks using snmp, including:

      snmp_load
      snmp_cpustats
      snmp_procname
      snmp_disk
      snmp_mem
      snmp_swap
      snmp_procs
      snmp_users
      snmp_mem2
      snmp_swap2
      snmp_mem3
      snmp_swap3
      snmp_disk2
      snmp_tcpopen
      snmp_tcpstats
      snmp_bgpstate
      check_netapp_uptime
      check_netapp_cupuload
      check_netapp_numdisks
      check_compaq_thermalCondition

      And, even better, you can create additional service checks quite easily.
      For the case of verifying that snmpd (the SNMP service on Linux) is running we
      need to ask SNMP a question. If we don't get an answer, then Nagios can assume
      that the SNMP service is down on that host. When you use service checks such as
      check_http, check_ssh and check_telnet this is what they are doing as well.

    - In our case, let's create a new service check and call it "check_system". This
      service check will connect with the specified host, use the private community
      string we have defined in class and ask a question of SNMP on that host. In this
      case we'll ask about the System Description, or the OID "sysDescr.0".

    - To do this start by editing the file /etc/nagios-plugins/config/snmp.cfg:

    # joe /etc/nagios-plugins/config/snmp.cfg

      At the top (or the bottom, your choice) add the following entry to the file:

# 'check_system' command definition
define command{
       command_name    check_system
       command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -C '$ARG1$' -o sysDescr.0
        }

      You may wish to copy and paste this vs. trying to type this out.

          Note that "command_line" is a single line. If you copy and paste in joe the line
          may not wrap properly and you may have to manually add the part:

                        '$ARG1$' -o sysDescr.0

          to the end of the line.

    - Now you need to edit the file /etc/nagios3/conf.d/services_nagios2.cfg and add
      in this service check. We'll run this check against the hostgroup
      "snmp-servers", which we create in the next step.

    # joe /etc/nagios3/conf.d/services_nagios2.cfg

      At the bottom of the file add the following definition:

# check that snmp is up on all servers
define service {
        hostgroup_name                  snmp-servers
        service_description             SNMP
        check_command                   check_system!xxxxxx
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

      The "xxxxxx" is the community string previously (or to be) defined in class.

      Note that we have included our private community string here vs. hard-coding
      it in the snmp.cfg file earlier. You must change the "xxxxxx" to be the snmp
      community string given in class or this check will not work.

    - Now we must create the "snmp-servers" group in our hostgroups_nagios2.cfg file.
      Edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg and go to the end of the
      file. Add in the following hostgroup definition:

# A list of snmp-enabled devices on which we wish to run the snmp service check
define hostgroup {
        hostgroup_name  snmp-servers
        alias           snmp servers
        members         noc
        }

        - Note that for "members" you could also add in the switches and routers for
          group 1 and 2. But the particular item (MIB) we are checking for, "sysDescr.0",
          may not be available on the switches and/or routers, so the check would then fail.

    - Now verify that your changes are correct and restart Nagios.

    - If you click on the Service Detail menu choice in the web interface you should see
      the SNMP check appear for the noc host.

    - After we do the SNMP presentation and exercises in class, you could come
      back to this exercise and add in all the classroom PCs to the members list of the
      snmp-servers hostgroup definition in the hostgroups_nagios2.cfg file. Remember to
      list your PC as "localhost".