Agenda: exercises-nagios-with-router.txt

File exercises-nagios-with-router.txt, 29.2 KB (added by admin, 7 years ago)
Nagios Installation and Configuration

Note: all of the commands in this exercise need to be run as root. So
rather than put 'sudo' in front of every one, please start a root shell:

    $ sudo bash
    #

The '#' prompt indicates that you are at a root shell, and is shown
in examples below - but don't type the '#' as part of the command!


Exercises
---------

PART I
------

0. Log in to your PC as the sysadm user.


1. Install Nagios

    Install Nagios version 3:

        # apt-get install nagios3

   Unless you already have an MTA installed, nagios3 will install
   postfix as a dependency. If you are prompted for this, select the
   "Internet Site" option. (If you had wanted to use a different MTA like
   exim you would install it before nagios3.)

   You will be prompted to choose a nagiosadmin password. Give it the normal
   workshop password.

   To get the documentation in /usr/share/doc/nagios3-doc/html/ (which
   can also be read via the nagios web interface), do:

       # apt-get install nagios3-doc


3. You should already have a working Nagios!

    - Open a browser, and go to

    http://pcX/nagios3/

        Check with the instructor or your neighbor if you are in doubt.

    - At the login prompt, login as:

        user: nagiosadmin
        pass: <workshop password>

    Browse to the "Host Detail" page to see what's already configured.


4. Let's look at the configuration layout...

    # cd /etc/nagios3
    # ls -l

    -rw-r--r-- 1 root root    1882 2008-12-18 13:42 apache2.conf
    -rw-r--r-- 1 root root   10524 2008-12-18 13:44 cgi.cfg
    -rw-r--r-- 1 root root    2429 2008-12-18 13:44 commands.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:33 conf.d
    -rw-r--r-- 1 root root      26 2009-02-14 12:36 htpasswd.users
    -rw-r--r-- 1 root root   42539 2008-12-18 13:44 nagios.cfg
    -rw-r----- 1 root nagios  1293 2008-12-18 13:42 resource.cfg
    drwxr-xr-x 2 root root    4096 2009-02-14 12:32 stylesheets

    # cd conf.d
    # ls -l

    -rw-r--r-- 1 root root 1695 2008-12-18 13:42 contacts_nagios2.cfg
    -rw-r--r-- 1 root root  418 2008-12-18 13:42 extinfo_nagios2.cfg
    -rw-r--r-- 1 root root 1152 2008-12-18 13:42 generic-host_nagios2.cfg
    -rw-r--r-- 1 root root 1803 2008-12-18 13:42 generic-service_nagios2.cfg
    -rw-r--r-- 1 root root  210 2009-02-14 12:33 host-gateway_nagios3.cfg
    -rw-r--r-- 1 root root  976 2008-12-18 13:42 hostgroups_nagios2.cfg
    -rw-r--r-- 1 root root 2167 2008-12-18 13:42 localhost_nagios2.cfg
    -rw-r--r-- 1 root root 1005 2008-12-18 13:42 services_nagios2.cfg
    -rw-r--r-- 1 root root 1609 2008-12-18 13:42 timeperiods_nagios2.cfg

    Notice that the package installs files with "nagios2" in their name.
    This is because they are the same files as were used for the Nagios
    version 2 Debian package. However there was a change made to the
    host-gateway configuration file, so this has a new name.


5. You have a config which is already monitoring your own system
(localhost_nagios2.cfg) and your upstream default gateway
(host-gateway_nagios3.cfg).

Have a look at the config file for the default gateway: it's very simple.
(Note: tab completion is useful here. Type "cat host-g" then hit tab; the
filename will be filled in for you)

    # cat host-gateway_nagios3.cfg

It should look something like this:

    # a host definition for the gateway of the default route
    define host {
            host_name   gateway
            alias       Default Gateway
            address     10.10.X.254
            use         generic-host
            }

It is monitoring the virtual Cisco router which is upstream of your VM.



PART II
Configuring Equipment
-----------------------------------------------------------------------------

0. Order of configuration

Conceptually we will build our configuration files starting with the
"nearest" devices and then moving on to the ones further away.

By going in this order you will already have defined the devices that act
as parents for other devices.

Your upstream Cisco virtual router (your PC's gateway) is already defined.

1. The three other PCs in your group are directly connected to you with
nothing in between.  So there are no dependencies.

Create a new file, 'pcs.cfg', to list the three other PCs in your group. The
example below is ONLY for pc1, which has pc2/pc3/pc4 in its group, so modify
it for your neighbours. (Your own PC is already monitored as "localhost".)

    # cd /etc/nagios3/conf.d/
    # editor pcs.cfg

define host {
    use         generic-host
    host_name   pc2
    alias       pc2 in group 1
    address     pc2.ws.nsrc.org
}

define host {
    use         generic-host
    host_name   pc3
    alias       pc3 in group 1
    address     pc3.ws.nsrc.org
}

define host {
    use         generic-host
    host_name   pc4
    alias       pc4 in group 1
    address     pc4.ws.nsrc.org
}


THE FOLLOWING STEPS 2a - 2c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!


2a. Verify that your configuration files are OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    ... You should get something like this:
Warning: Host 'pc2' has no services associated with it!
Warning: Host 'pc3' has no services associated with it!
Warning: Host 'pc4' has no services associated with it!
...
Total Warnings: 3
Total Errors:   0

Things look okay - No serious problems were detected during the check.

Nagios is saying that it's unusual to monitor a device just for its
existence on the network, without also monitoring some service.


2b. Reload/Restart Nagios

    # service nagios3 restart


HINT: You will be doing this a lot. If you do it all on one line, like this,
then you can hit cursor-up and rerun all in one go:

    # nagios3 -v /etc/nagios3/nagios.cfg && service nagios3 restart

The '&&' ensures that the restart only happens if the config is valid.
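
If you prefer something shorter to type, the same "verify, then restart" logic
can be wrapped in a small shell function. This is only a sketch: the function
name nagios_reload and the NAGIOS_BIN / RESTART_CMD override variables are made
up for this example and are not part of Nagios. On failure it prints the end of
the verification output, which is where the errors are summarised.

```shell
# Validate the Nagios config and restart only if validation succeeds.
# NAGIOS_BIN and RESTART_CMD are hypothetical overrides for experimenting;
# by default the commands from the exercise are used.
nagios_reload() {
    cfg=${1:-/etc/nagios3/nagios.cfg}
    log=$(mktemp) || return 1
    if ${NAGIOS_BIN:-nagios3} -v "$cfg" >"$log" 2>&1; then
        rm -f "$log"
        ${RESTART_CMD:-service nagios3 restart}
    else
        # Validation failed: show the last lines of the report, do not restart
        tail -n 20 "$log" >&2
        rm -f "$log"
        return 1
    fi
}
```

After pasting this into your root shell, run "nagios_reload" whenever you have
changed a config file.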


2c. Go to the web interface (http://pcX/nagios3) and check that the hosts
   you just added are now visible in the interface. Click on the "Host Detail" item
   on the left of the Nagios screen to see this. You may see it in "PENDING"
   status until the check is carried out.



3. Let's configure Nagios to start monitoring the classroom switch and then
the backbone router.

Add the switch in a new file:

    # cd /etc/nagios3/conf.d
    # editor switches.cfg

define host {
    use         generic-host
    host_name   bb-sw
    alias       backbone switch
    address     10.10.0.253
    parents     gateway
}


And let's create a file for routers:

        # editor routers.cfg

define host {
    use         generic-host
    host_name   bb-gw
    alias       backbone gw
    address     10.10.0.254
    parents     bb-sw
}

Notice the "parents" entry. This must point at a device or devices which are
also defined somewhere else in the configuration.

From a topology point of view, pcX cannot reach the switch 'bb-sw' if its
gateway is down; so the parent of bb-sw is gateway.  Similarly, you cannot
reach bb-gw if bb-sw is down, so the parent of bb-gw is bb-sw.


We end up with this relationship from the point of view of Nagios:

    [Nagios]
       |
       |
    gateway   ==>    host-gateway_nagios3.cfg
       |
       |
     bb-sw    ==>    switches.cfg (parent is gateway)
       |
       |
     bb-gw    ==>    routers.cfg (parent is bb-sw)


Once you have created these files, validate the config and restart nagios
(by repeating steps 2a - 2c above) and check the web interface.

Try the "Status Map" option: it gives you a graphical view of the
parent-child relationships you have just defined.
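
If you want a quick text view of the same relationships, a few lines of awk
can pull the host_name and parents values out of the host definitions. This is
a rough sketch, not a real Nagios config parser: list_parents is a made-up
helper name, and it assumes one directive per line and a single parent per
host, as in the examples above.

```shell
# Print "host <- parent" for every "define host" block in the given files.
# Simplified: only the first parent is shown, templates without a
# host_name are skipped, and hostgroup/service blocks are ignored.
list_parents() {
    awk '
        /^[ \t]*define[ \t]+host[ \t]*[{]/ { inhost = 1; name = ""; parents = "(none)"; next }
        inhost && $1 == "host_name"        { name = $2 }
        inhost && $1 == "parents"          { parents = $2 }
        inhost && /^[ \t]*[}]/             { if (name != "") print name " <- " parents; inhost = 0 }
    ' "$@"
}
```

For example, "list_parents /etc/nagios3/conf.d/*.cfg" should list bb-sw with
gateway as its parent, and bb-gw with bb-sw as its parent.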


4. Create an entry for the classroom NOC

Open the existing pcs.cfg and add a new entry to the end:

        # editor pcs.cfg

# Our classroom NOC

define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
    parents     bb-sw
}


Question: why is the parent 'bb-sw'?

As usual, validate configuration and restart nagios.


PART III
Configure Service checks for the classroom NOC
-----------------------------------------------------------------------------

0. Configuring

Now that we have our hardware configured we can start telling Nagios what services to monitor
on the configured hardware.

The most basic way is to define individual service checks.

1. Edit pcs.cfg and add the following service check near the definition for
the 'noc' host

    # cd /etc/nagios3/conf.d
    # editor pcs.cfg

define service {
        host_name                       noc
        service_description             HTTP
        check_command                   check_http
        use                             generic-service
        notification_interval           0
}


2. Validate the config, restart, and via the nagios web interface check that
the http service is being monitored (go to "service detail" page)


However, when you are checking many identical services, this approach
quickly becomes tedious. For example, you may have many hosts which are
running an ssh server and you wish to monitor that service. In that case
you create a single service definition, and link it to a group of hosts.


3. Look inside the file 'services_nagios2.cfg':

    # cat services_nagios2.cfg

... it should include a section like this:

# check that ssh services are running
define service {
        hostgroup_name                  ssh-servers
        service_description             SSH
        check_command                   check_ssh
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}


4. Open the hostgroups file

    # editor hostgroups_nagios2.cfg

    - Find the hostgroup named "ssh-servers". It should look like this:

define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost
        }

Change the line which says

                members                 localhost

to

                members                 localhost,noc


Exit and save the file.


5. Verify that your changes are OK:

        # nagios3 -v /etc/nagios3/nagios.cfg

Restart Nagios to see the new service association with your host:

        # service nagios3 restart

Click on the "Service Detail" link in the Nagios web interface to see your new entry.
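
You will be adding hosts to "members" lines several more times in the parts
that follow. If you would rather script the change than edit by hand, the
sketch below shows one way. add_hostgroup_member is a made-up helper, not a
Nagios tool. It assumes hostgroup_name comes before members inside each
definition (as in the packaged file), it does not check whether the host is
already listed, and it only prints the edited file to standard output, so your
original file is left untouched.

```shell
# Append HOST to the members line of hostgroup GROUP and print the result.
# usage: add_hostgroup_member GROUP HOST FILE
add_hostgroup_member() {
    awk -v group="$1" -v host="$2" '
        $1 == "hostgroup_name"              { current = $2 }
        $1 == "members" && current == group { sub(/[ \t]+$/, ""); $0 = $0 "," host }
        /^[ \t]*[}]/                        { current = "" }
        { print }
    ' "$3"
}
```

For example, "add_hostgroup_member ssh-servers noc hostgroups_nagios2.cfg"
prints the file with "noc" appended to the ssh-servers members line only.
Review the output before you use it to replace the real file.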


PART IV
Defining more devices
-----------------------------------------------------------------------------

1. Create entries for some other routers and PCs in the classroom

Now that we have our routers and switches defined it is quite easy to create
entries for another group's router and PCs.  Think about the parent
relationships:

                   gw
                    |
          +-------------------+
          |        sw         |
          +-------------------+
           |                 |
        gateway             rtrN
           |                 |
     +---+-+-+---+     +---+-+-+---+
     |   |   |   |     |   |   |   |
    pcA pcB pcC pcD   pcW pcX pcY pcZ

The parent of one of your neighbour's PCs is THEIR router. The parent of
their router is the switch.

If you are in doubt: DRAW this on paper!

So: pick a group to monitor - this example assumes you decided to pick
group 2. Edit routers.cfg to add their router:

define host {
    use         generic-host
    host_name   rtr2
    alias       group 2 router
    address     rtr2.ws.nsrc.org
    parents     bb-sw
}

And edit pcs.cfg to add their PCs:

define host {
    use         generic-host
    host_name   pc5
    alias       pc5 outside interface
    address     pc5.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc6
    alias       pc6 outside interface
    address     pc6.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc7
    alias       pc7 outside interface
    address     pc7.ws.nsrc.org
    parents     rtr2
}
define host {
    use         generic-host
    host_name   pc8
    alias       pc8 outside interface
    address     pc8.ws.nsrc.org
    parents     rtr2
}
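
The four PC entries above differ only in the PC number, so they could also be
generated with a short shell loop instead of being typed out. The sketch below
is one possible way; make_host is a made-up helper name, and the output goes to
the screen so you can review it before pasting it into pcs.cfg.

```shell
# Print one host definition; the fourth argument (parent) is optional.
# usage: make_host NAME ALIAS ADDRESS [PARENT]
make_host() {
    printf 'define host {\n'
    printf '    use         generic-host\n'
    printf '    host_name   %s\n' "$1"
    printf '    alias       %s\n' "$2"
    printf '    address     %s\n' "$3"
    if [ -n "${4:-}" ]; then
        printf '    parents     %s\n' "$4"
    fi
    printf '}\n'
}

# Group 2 from the example above: pc5 to pc8, all behind rtr2
for n in 5 6 7 8; do
    make_host "pc$n" "pc$n outside interface" "pc$n.ws.nsrc.org" rtr2
done
```

Change the list of numbers and the router name to match the group you chose.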


You can review the Network Diagram for the class linked off the classroom wiki
main page.

As before, repeat steps 2a-2c to verify your configuration, correct any
errors, and activate it.

PART V
Defining more services
-----------------------------------------------------------------------------

0. For services, the default normal_check_interval is 5 (minutes) in
   generic-service_nagios2.cfg. You may wish to change this to 1 to speed up
   how quickly service issues are detected, at least in the workshop.

1. Determine what services to define for what devices

   - In this particular class we have:

     routers:  running ssh and snmp
     switches: running telnet and possibly ssh as well as snmp
     pcs:      All PCs are running ssh and http and should be running snmp
               The NOC is currently running an snmp daemon

     So, let's configure Nagios to check for these services for these
     devices.

2.) Verify that SSH is running on the routers and workshop PC images

   - In the file services_nagios2.cfg there is already an entry for the SSH
     service check, so you do not need to create this. Instead, you
     simply need to re-define the "ssh-servers" entry in the file
     /etc/nagios3/conf.d/hostgroups_nagios2.cfg. After your earlier change,
     the entry in the file looks like:

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,noc
        }

     What do you think you should change? Correct, the "members" line. You should
     add the other group's router and PCs that you defined above. You can
     also add "bb-sw" and "bb-gw" since they are also running SSH servers.

     The entry will look something like this:

define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,noc,rtr2,pc5,pc6,pc7,pc8,bb-sw,bb-gw
        }

         Note: leave in "localhost" - This is your PC and represents Nagios' network point of
         view.

         The "members" entry will be a long line and might wrap on the screen.

    - Once you are done, run the pre-flight check:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios

    # service nagios3 restart

    and view your changes in the Nagios web interface.

3.) Check that http is running on all the classroom PCs.

    - This is almost identical to the previous exercise.  There is already
      a hostgroup called 'http-servers' in the hostgroups_nagios2.cfg
      file, so you just need to add the PCs you are monitoring there as
      members of the http-servers group.



PART VI
Create More Host Groups
-----------------------------------------------------------------------------

0. In the web view, look at the pages "Hostgroup Overview", "Hostgroup
   Summary", "Hostgroup Grid". This gives a convenient way to group together
   hosts which are related (e.g. in the same site, serving the same purpose).

1. Update /etc/nagios3/conf.d/hostgroups_nagios2.cfg

    - For the following exercises it will be very useful if we have created
      or updated the following hostgroups:

      debian-servers
      routers
      switches

      If you edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg you
      will see an entry for debian-servers that just contains localhost.
      Update this entry to include all the classroom PCs you are monitoring,
      including the NOC, but not including the routers.

    # editor /etc/nagios3/conf.d/hostgroups_nagios2.cfg

     Update the entry that says:


# A list of your Debian GNU/Linux servers
define hostgroup {
        hostgroup_name  debian-servers
                alias           Debian GNU/Linux Servers
                members         localhost
        }

      So that the "members" parameter contains something like this. Use your
      classroom network diagram to confirm the exact number of machines and names
      in your workshop.

                members         localhost,noc,pc5,pc6,pc7,pc8

        Be sure that this stays one single line, even if it wraps on the
        screen, and is not split over two separate lines. Otherwise
        you will get an error when you go to restart Nagios. Remember that
        your own PC is "localhost".

      - Once you have done this, add in two more host groups, one for routers and
        one for switches. Call these entries "routers" and "switches".
        Include the routers and switches you are monitoring.

      - When you are done be sure to verify your work and restart Nagios.
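
If you are unsure what the two new entries should look like, the sketch below
prints one possible version. The hostgroup names "routers" and "switches" come
from the exercise, and the members are the hosts defined earlier (gateway,
bb-gw, rtr2 and bb-sw). The alias texts and the function name
print_network_hostgroups are only placeholders chosen for this example, so
adjust them and the members to match what you are actually monitoring.

```shell
# Print example "routers" and "switches" hostgroup definitions.
# The aliases are placeholders; members must be hosts you have defined.
print_network_hostgroups() {
    cat <<'EOF'
# A list of the routers we monitor
define hostgroup {
        hostgroup_name  routers
                alias           Routers
                members         gateway,bb-gw,rtr2
        }

# A list of the switches we monitor
define hostgroup {
        hostgroup_name  switches
                alias           Switches
                members         bb-sw
        }
EOF
}

print_network_hostgroups
```

The printed text can be pasted at the end of hostgroups_nagios2.cfg.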

2. Go back to the web interface and look at your new hostgroups.


PART VII
Extended Host Information ("making your graphs pretty")
-----------------------------------------------------------------------------

1. Update extinfo_nagios2.cfg

    - If you would like to use appropriate icons for your defined hosts in
      Nagios this is where you do this. We have three types of devices:

      Cisco routers
      Cisco switches
      Ubuntu servers

      There is a fairly large repository of icon images available for you to
      use located here:

      /usr/share/nagios/htdocs/images/logos/

      These were installed by default as dependent packages of the nagios3
      package in Ubuntu. In some cases you can find model-specific icons for
      your hardware, but to make things simpler we will use the following
      icons for our hardware:

      /usr/share/nagios/htdocs/images/logos/base/debian.*
      /usr/share/nagios/htdocs/images/logos/cook/router.*
      /usr/share/nagios/htdocs/images/logos/cook/switch.*

    - The next step is to edit the file /etc/nagios3/conf.d/extinfo_nagios2.cfg
      and tell nagios what image you would like to use to represent your devices.

    # editor /etc/nagios3/conf.d/extinfo_nagios2.cfg

      Here is what an entry for your routers looks like (there is already an entry
      for debian-servers that will work as is). Note that the router model (3600)
      is not all that important. The image used represents a router in general.

define hostextinfo {
        hostgroup_name   routers
        icon_image       cook/router.png
        icon_image_alt   Cisco Routers (3600)
        vrml_image       router.png
        statusmap_image  cook/router.gd2
}

      Now add an entry for your switches. Once you are done check your
      work and restart Nagios. Take a look at the Status Map in the web interface.
      It should be much nicer, with real icons instead of question marks.


PART VIII
Create Service Groups
-----------------------------------------------------------------------------

1. Create service groups for http for your group's PCs.

   - The idea is to create groups of services for display; one for the
     HTTP servers in your own group, and one for the HTTP servers in the
     other group you are monitoring. To do this create a new file:

   # editor /etc/nagios3/conf.d/servicegroups.cfg

# My group (example is for group 1)
define servicegroup {
        servicegroup_name       group1-http
        alias                   group 1 HTTP services
        members                 localhost,HTTP,pc2,HTTP,pc3,HTTP,pc4,HTTP
        }

# Another group (example is for group 2)
define servicegroup {
        servicegroup_name       group2-http
        alias                   group 2 HTTP services
        members                 pc5,HTTP,pc6,HTTP,pc7,HTTP,pc8,HTTP
        }

        - Note that "HTTP" needs to be uppercase as this is how the service_description is
          written in the file /etc/nagios3/conf.d/services_nagios2.cfg

    - Save your changes, verify your work and restart Nagios. Now if you click on
      the Servicegroup menu items in the Nagios web interface you should see
      this information grouped together.

    - If you like you can also create service groups for SSH between
      the groups.
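
Notice that a servicegroup "members" line is a flat list of host,service
pairs, which is easy to mistype. As a small sketch (sg_members is a made-up
helper name, not a Nagios command), the shell can build that list for you from
a service name and a list of hosts:

```shell
# Build a servicegroup members value: host1,SERVICE,host2,SERVICE,...
# usage: sg_members SERVICE HOST...
sg_members() {
    service=$1
    shift
    members=
    for host in "$@"; do
        # Add a comma separator only when the list is not empty yet
        members="${members:+$members,}$host,$service"
    done
    printf '%s\n' "$members"
}

sg_members HTTP pc5 pc6 pc7 pc8
```

The last line prints the members value used for group2-http above. Calling
it with SSH instead of HTTP gives the list for an SSH service group.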


PART IX
Configure Guest Access to the Nagios Web Interface
-----------------------------------------------------------------------------

1. Edit /etc/nagios3/cgi.cfg to give read-only guest user access to the Nagios
   web interface.

    - By default Nagios is configured to give full r/w access via the Nagios
      web interface to the user nagiosadmin. You can change the name of this
      user, add other users, change how you authenticate users, what users
      have access to what resources and more via the cgi.cfg file.

    - First, let's create a "guest" user and password in the htpasswd.users
      file.

    # htpasswd /etc/nagios3/htpasswd.users guest

      You can use any password you want (or none). A password of "guest" is
      not a bad choice.

    - Next, edit the file /etc/nagios3/cgi.cfg and look for what type of access
      has been given to the nagiosadmin user. By default you will see the following
      directives (note, there are comments between each directive):

      authorized_for_system_information=nagiosadmin
      authorized_for_configuration_information=nagiosadmin
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin
      authorized_for_all_hosts=nagiosadmin
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

      Now let's tell Nagios to allow the "guest" user some access to
      information via the web interface. You can choose whatever you would
      like, but what is pretty typical is this:

      authorized_for_system_information=nagiosadmin,guest
      authorized_for_configuration_information=nagiosadmin,guest
      authorized_for_system_commands=nagiosadmin
      authorized_for_all_services=nagiosadmin,guest
      authorized_for_all_hosts=nagiosadmin,guest
      authorized_for_all_service_commands=nagiosadmin
      authorized_for_all_host_commands=nagiosadmin

    - Once you make the changes, save the file cgi.cfg, verify your
      work and restart Nagios.

    - To see if you can log in as the "guest" user you may need to clear
      the cookies in your web browser. You will not notice any difference
      in the web interface. The difference is that a number of items that
      are available via the web interface (forcing a service/host check,
      scheduling checks, comments, etc.) will not work for the guest
      user.
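
Because the directives are spread through cgi.cfg with comments in between,
it can be handy to list just the ones that mention a given user. The function
below is a sketch for that purpose; directives_for_user is a made-up name, and
it assumes the user names are separated by commas with no spaces, as in the
examples above.

```shell
# List the authorized_for_* directives in a cgi.cfg that name USER.
# usage: directives_for_user USER FILE
directives_for_user() {
    awk -F= -v user="$1" '
        /^authorized_for_/ {
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == user) print $1
        }
    ' "$2"
}
```

Running "directives_for_user guest /etc/nagios3/cgi.cfg" after your edit
should list only the four information/hosts/services directives, and none of
the *_commands ones.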


OPTIONAL
--------

You can now look at configuring different plugins for monitoring
services.

*    As opposed to just checking that a web server is
     running on the classroom PCs, you could also check that the nagios3
     service is available, by requesting the /nagios3/ path. This means
     passing extra options to the check_http plugin.

     For a description of the available options, type this:

      # /usr/lib/nagios/plugins/check_http
      # /usr/lib/nagios/plugins/check_http --help

     and of course you can browse the online nagios documentation or google
     for information on check_http. You can even run the plugin by hand to
     perform a one-shot service check:

     # /usr/lib/nagios/plugins/check_http -H localhost -u /nagios3/

     So the goal is to configure nagios to call check_http in this way.
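
When you run a plugin by hand, remember that Nagios decides the state from
the plugin's exit status, not from the text it prints: by convention 0 means
OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. The small helper below (plugin_state
is a made-up name for this sketch) turns the number into the matching word.

```shell
# Translate a plugin exit status into the state name Nagios would show.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "OUT-OF-RANGE ($1)" ;;
    esac
}

# Typical use, straight after running a plugin by hand:
#   /usr/lib/nagios/plugins/check_http -H localhost -u /nagios3/
#   plugin_state $?
```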

There is no suitable command definition available, so we need to create one.

# editor /etc/nagios-plugins/config/local.cfg
define command{
        command_name    check_http_arg
        command_line    /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' $ARG1$
        }

# editor /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        hostgroup_name                  nagios-servers
        service_description             NAGIOS
        check_command                   check_http_arg!-u /nagios3/
        use                             generic-service
}

     and of course you'll need to create a hostgroup called nagios-servers (in
     hostgroups_nagios2.cfg) to link to this service check.

     Once you have done this, check that Nagios warns you about failing
     authentication (because it's trying to fetch the page without providing
     the username/password). There's an extra parameter you can pass to
     check_http_arg to provide that info, see if you can find it.

      WARNING: in the tradition of "Debian Knows Best", their definition of the
      check_http command in /etc/nagios-plugins/config/http.cfg
      is *not* the same as that recommended in the nagios3 documentation.
      It is missing $ARG1$, so any parameters to pass to check_http are
      ignored. So you might think you are monitoring /nagios3/ but actually
      you are monitoring root!

     This is why we had to make a new command definition "check_http_arg".
     You could make a more specific one like "check_nagios", or you could
     modify the Ubuntu check_http definition to fit the standard usage.
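
To see why a missing $ARG1$ matters, it helps to imitate what Nagios does with
a check_command: everything after the "!" becomes $ARG1$, and it is pasted
into the command_line wherever that macro appears. The function below is a
much simplified imitation written for this exercise (expand_check is a made-up
name, only $HOSTADDRESS$ and $ARG1$ are handled, and the values must not
contain "|" or "&"). The second template is a stand-in for any definition
without $ARG1$, not a copy of the packaged one.

```shell
# Imitate macro expansion for one check: substitute $HOSTADDRESS$ and $ARG1$.
# usage: expand_check TEMPLATE CHECK_COMMAND HOSTADDRESS
expand_check() {
    template=$1
    addr=$3
    case "$2" in
        *'!'*) arg1=${2#*!} ;;   # text after the first "!" is $ARG1$
        *)     arg1= ;;          # no "!" means no argument
    esac
    printf '%s\n' "$template" |
        sed -e 's|\$HOSTADDRESS\$|'"$addr"'|g' -e 's|\$ARG1\$|'"$arg1"'|g'
}

with_arg='/usr/lib/nagios/plugins/check_http -H $HOSTADDRESS$ $ARG1$'
without_arg='/usr/lib/nagios/plugins/check_http -H $HOSTADDRESS$'

expand_check "$with_arg"    'check_http_arg!-u /nagios3/' 10.10.0.250
expand_check "$without_arg" 'check_http!-u /nagios3/'     10.10.0.250
```

The first call prints a command that ends in "-u /nagios3/". The second prints
the command without it: the argument is silently dropped, which is exactly the
problem described in the warning above.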

* Check that SNMP is running on the classroom NOC

    - First you will need to add in the appropriate service check for SNMP in the file
      /etc/nagios3/conf.d/services_nagios2.cfg. This is where Nagios is impressive. There
      are hundreds, if not thousands, of service checks available via the various Nagios
      sites on the web. You can see what plugins are installed by Ubuntu in the nagios3
      package that we've installed by looking in the following directory:

    # ls /usr/lib/nagios/plugins

      As you'll see there is already a check_snmp plugin available to us. If you are
      interested in the options the plugin takes you can execute the plugin from the
      command line by typing:

    # /usr/lib/nagios/plugins/check_snmp
    # /usr/lib/nagios/plugins/check_snmp --help

      to see what options are available, etc. You can use the check_snmp plugin and
      Nagios to create very complex or specific system checks.

    - Now to see all the various service/host checks that have been created using the
      check_snmp plugin you can look in /etc/nagios-plugins/config/snmp.cfg. You will
      see that there are a lot of preconfigured checks using snmp, including:

      snmp_load
      snmp_cpustats
      snmp_procname
      snmp_disk
      snmp_mem
      snmp_swap
      snmp_procs
      snmp_users
      snmp_mem2
      snmp_swap2
      snmp_mem3
      snmp_swap3
      snmp_disk2
      snmp_tcpopen
      snmp_tcpstats
      snmp_bgpstate
      check_netapp_uptime
      check_netapp_cpuload
      check_netapp_numdisks
      check_compaq_thermalCondition

      And, even better, you can create additional service checks quite easily.
      For the case of verifying that snmpd (the SNMP service on Linux) is running we
      need to ask SNMP a question. If we don't get an answer, then Nagios can assume
      that the SNMP service is down on that host. When you use service checks such as
      check_http, check_ssh and check_telnet this is what they are doing as well.

    - In our case, let's create a new service check and call it "check_system". This
      service check will connect to the specified host, use the private community
      string we have defined in class and ask a question of snmp on that host - in this
      case we'll ask about the System Description, or the OID "sysDescr.0".

    - To do this start by editing the file /etc/nagios-plugins/config/snmp.cfg:

    # editor /etc/nagios-plugins/config/snmp.cfg

      At the top (or the bottom, your choice) add the following entry to the file:

# 'check_system' command definition
define command{
       command_name    check_system
       command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -C '$ARG1$' -o sysDescr.0
        }

      You may wish to copy and paste this rather than typing it out.

          Note that "command_line" is a single line. If you copy and paste in joe the line
          may not wrap properly and you may have to manually add the part:

                        '$ARG1$' -o sysDescr.0

          to the end of the line.

    - Now you need to edit the file /etc/nagios3/conf.d/services_nagios2.cfg and add
      in this service check. We'll run this check against a new hostgroup called
      "snmp-servers", which we will create in the next step.

    - Edit the file /etc/nagios3/conf.d/services_nagios2.cfg

    # editor /etc/nagios3/conf.d/services_nagios2.cfg

      At the bottom of the file add the following definition:

# check that snmp is up on all servers
define service {
        hostgroup_name                  snmp-servers
        service_description             SNMP
        check_command                   check_system!xxxxxx
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

      The "xxxxxx" is the community string previously (or to be) defined in class.

      Note that we have included our private community string here vs. hard-coding
      it in the snmp.cfg file earlier. You must change the "xxxxxx" to be the snmp
      community string given in class or this check will not work.

    - Now we must create the "snmp-servers" group in our hostgroups_nagios2.cfg file.
      Edit the file /etc/nagios3/conf.d/hostgroups_nagios2.cfg and go to the end of the
      file. Add in the following hostgroup definition:

# A list of snmp-enabled devices on which we wish to run the snmp service check
define hostgroup {
           hostgroup_name       snmp-servers
                   alias        snmp servers
                   members      noc
          }

        - Note that for "members" you could also add in the switches and routers for
          group 1 and 2. But the particular item (OID) we are checking for, "sysDescr.0",
          may not be available on the switches and/or routers, so the check would then fail.

    - Now verify that your changes are correct and restart Nagios.

    - If you click on the Service Detail menu choice in the web interface you should see
      the SNMP check appear for the noc host.

    - After we do the SNMP presentation and exercises in class, then you could come
      back to this exercise and add in all the classroom PCs to the members list in the
      hostgroups_nagios2.cfg file, snmp-servers hostgroup definition. Remember to list
      your PC as "localhost".

