Agenda: exercises-nagios-I-III-basic.txt

File exercises-nagios-I-III-basic.txt, 13.3 KB (added by b.candler, 6 years ago)
Nagios Installation and Configuration

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Exercises
---------

PART I
----------------

0. Log in to your virtual machine as the sysadm user.

1. Install Nagios Version 3
---------------------------

Become the root user, then install the packages:

        $ sudo bash
        # apt-get install nagios3 nagios3-doc

During installation you will be prompted for the "Nagios web administration password:".
This is the password for the Nagios user "nagiosadmin". When prompted, enter the
password you are using for your sysadm account.

Note: if you have not already done so, you may be asked to configure
the Postfix Mail Transport Agent during the Nagios installation process.
Just accept the default "Internet Site".

2. See Initial Nagios Configuration
------------------------------------

Open a browser, and go to your machine like this (replace "N" with the number
of your PC):

        http://pcN.ws.nsrc.org/nagios3/

At the login prompt, log in as:

        User Name: nagiosadmin
        Password:  <CLASS PASSWORD>

Click on the "Hosts" link on the left of the initial Nagios page to see what has
already been configured.

3. Enable External commands in nagios.cfg
-----------------------------------------

This change is required in order to allow users to "Acknowledge" problems with
hosts and services in the Web interface.

First, edit the file /etc/nagios3/nagios.cfg, and change the line:

        check_external_commands=0

to

        check_external_commands=1

Save the file and exit.

Then run the following commands to change the directory permissions and
to make the changes permanent:

        # /etc/init.d/nagios3 stop
        # dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw
        # dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3
        # /etc/init.d/nagios3 start

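If you prefer not to edit nagios.cfg by hand, the same change can be scripted.
The following is only a sketch and is not part of the original exercise: the
helper name enable_external_commands is made up for illustration, and it assumes
the setting appears as "check_external_commands=0" at the very start of a line.

```shell
#!/bin/sh
# Sketch: switch check_external_commands from 0 to 1 in a Nagios config file.
# enable_external_commands is a hypothetical helper name, not a Nagios tool.
enable_external_commands() {
    cfg="$1"
    # Keep a backup copy next to the original before changing anything.
    cp "$cfg" "$cfg.bak" || return 1
    # Rewrite only the one setting; every other line is copied unchanged.
    sed 's/^check_external_commands=0/check_external_commands=1/' "$cfg.bak" > "$cfg"
}

# Usage (as root):
#   enable_external_commands /etc/nagios3/nagios.cfg
```

The backup is left in place as nagios.cfg.bak so you can compare the two files
with "diff" before restarting Nagios.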

4. Update the File hostgroups_nagios2.cfg
-----------------------------------------

        # cd /etc/nagios3/conf.d
        # editor hostgroups_nagios2.cfg

Go to the bottom of the file and add the following entry (we STRONGLY encourage you
to COPY and PASTE!):


define hostgroup {
        hostgroup_name  ping-servers
                alias           Pingable servers
                members         rtrX
        }

Where "rtrX" is the router for your group. That is, if you are in group 1, then
replace "rtrX" with "rtr1". Now save the file and exit.


5. Add Routers, PCs and Switches
--------------------------------

We will create three files, routers.cfg, switches.cfg and pcs.cfg, and make
entries for the hardware in our classroom.

6a. Creating the switches.cfg file
----------------------------------

        # cd /etc/nagios3/conf.d                                (just to be sure)
        # editor switches.cfg

In this file add the following entry (COPY and PASTE!):

define host {
    use         generic-host
    host_name   sw
    alias       Backbone Switch
    address     10.10.0.253
}

Save the file and exit.

6b. Creating the routers.cfg file
---------------------------------

We have up to 10 routers in total: rtr1-rtr9 and gw-rtr. We also have 1 or 2
wireless Access Points (ap1, ap2). We will define entries for each of these. If any
of these devices do not exist in your workshop, then do not include them. Remember,
COPY and PASTE!

        # editor routers.cfg


define host {
    use         generic-host
    host_name   gw-rtr
    alias       Classroom Gateway Router
    address     10.10.0.254
}

define host {
    use         generic-host
    host_name   rtr1
    alias       Group 1 Gateway Router
    address     10.10.1.254
}

define host {
    use         generic-host
    host_name   rtr2
    alias       Group 2 Gateway Router
    address     10.10.2.254
}

define host {
    use         generic-host
    host_name   rtr3
    alias       Group 3 Gateway Router
    address     10.10.3.254
}

define host {
    use         generic-host
    host_name   rtr4
    alias       Group 4 Gateway Router
    address     10.10.4.254
}

define host {
    use         generic-host
    host_name   rtr5
    alias       Group 5 Gateway Router
    address     10.10.5.254
}

define host {
    use         generic-host
    host_name   rtr6
    alias       Group 6 Gateway Router
    address     10.10.6.254
}

define host {
    use         generic-host
    host_name   rtr7
    alias       Group 7 Gateway Router
    address     10.10.7.254
}

define host {
    use         generic-host
    host_name   rtr8
    alias       Group 8 Gateway Router
    address     10.10.8.254
}

define host {
    use         generic-host
    host_name   rtr9
    alias       Group 9 Gateway Router
    address     10.10.9.254
}

define host {
    use         generic-host
    host_name   ap1
    alias       Wireless Access Point 1
    address     10.10.0.251
}

define host {
    use         generic-host
    host_name   ap2
    alias       Wireless Access Point 2
    address     10.10.0.252
}


Now save and exit from the file.

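The group router entries (rtr1 to rtr9) all follow one pattern: the group number
appears in the host name, the alias and the third octet of the address. As an
alternative to pasting, a short loop can print them. This is only a sketch: the
helper name gen_router_hosts is made up for illustration, and gw-rtr, ap1 and ap2
do not fit the pattern, so they still have to be added by hand.

```shell
#!/bin/sh
# Sketch: print "define host" entries for rtr1 .. rtrN.
# gen_router_hosts is a hypothetical helper name, not a Nagios tool.
gen_router_hosts() {
    last="$1"
    n=1
    while [ "$n" -le "$last" ]; do
        printf 'define host {\n'
        printf '    use         generic-host\n'
        printf '    host_name   rtr%s\n' "$n"
        printf '    alias       Group %s Gateway Router\n' "$n"
        printf '    address     10.10.%s.254\n' "$n"
        printf '}\n\n'
        n=$((n + 1))
    done
}

# Usage (as root, in /etc/nagios3/conf.d):
#   gen_router_hosts 9 >> routers.cfg
```

Pass a smaller number if your workshop has fewer groups, so that you do not
define routers that do not exist.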

6c. Creating the pcs.cfg File
-----------------------------

Now we will create entries for all the Virtual Machines in our classroom. Below
we give you the first few entries. You should complete the file with as many PCs
as you wish to add. We recommend that you add, at a minimum, the 4 PCs that are
members of your group, an entry for the classroom NOC, and at least
one PC from another group (remember to COPY and PASTE!):

        # editor pcs.cfg


define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
}

#
# Group 1
#

define host {
    use         generic-host
    host_name   pc1
    alias       pc1
    address     10.10.1.1
}

define host {
    use         generic-host
    host_name   pc2
    alias       pc2
    address     10.10.1.2
}

define host {
    use         generic-host
    host_name   pc3
    alias       pc3
    address     10.10.1.3
}

define host {
    use         generic-host
    host_name   pc4
    alias       pc4
    address     10.10.1.4
}

#
# Another PC (example only!)
#

define host {
    use         generic-host
    host_name   pc20
    alias       pc20
    address     10.10.5.20
}

You can save and exit from the file now, or you can continue to add more PC entries.
If you have not added the PCs for your group, be sure to do that before you exit from the
file.

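If you want to add many PCs, the entries can be generated in the same way. The
sketch below rests on an assumption inferred from the examples above (pc1-pc4 in
10.10.1.x, pc20 at 10.10.5.20): that there are 4 PCs per group and that pcN has
the address 10.10.<group>.N. Check this against your network diagram before you
rely on it. The helper name gen_pc_hosts is made up for illustration.

```shell
#!/bin/sh
# Sketch: print "define host" entries for pcFIRST .. pcLAST.
# ASSUMPTION: 4 PCs per group, and pcN lives at 10.10.<group>.<N>.
# gen_pc_hosts is a hypothetical helper name, not a Nagios tool.
gen_pc_hosts() {
    first="$1"
    last="$2"
    n="$first"
    while [ "$n" -le "$last" ]; do
        # Integer division: pc1-pc4 -> group 1, pc5-pc8 -> group 2, ...
        group=$(( (n + 3) / 4 ))
        printf 'define host {\n'
        printf '    use         generic-host\n'
        printf '    host_name   pc%s\n' "$n"
        printf '    alias       pc%s\n' "$n"
        printf '    address     10.10.%s.%s\n' "$group" "$n"
        printf '}\n\n'
        n=$((n + 1))
    done
}

# Usage (as root, in /etc/nagios3/conf.d), e.g. for group 2:
#   gen_pc_hosts 5 8 >> pcs.cfg
```

Do not generate entries for PCs that are already in pcs.cfg; defining the same
host twice will make the configuration check in step 7a fail.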


STEPS 7a - 7c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!
=======================================================================

7a. Verify that your configuration files are OK
-----------------------------------------------

        # nagios3 -v /etc/nagios3/nagios.cfg

You should see some warnings like these:

Checking services...
        Checked 7 services.
Checking hosts...
Warning: Host 'gw-rtr' has no services associated with it!
Warning: Host 'rtr1' has no services associated with it!
Warning: Host 'rtr2' has no services associated with it!

etc....
...
Total Warnings: N
Total Errors:   0

Things look okay - No serious problems were detected during the check.

With these warnings, Nagios is saying that it's unusual to monitor a device
just for its existence on the network, without also monitoring some service.


7b. Reload/Restart Nagios
-------------------------

        # service nagios3 restart

HINT: You will be doing this a lot. If you do it all on one line, like this,
then you can hit cursor-up and rerun all in one go:

        # nagios3 -v /etc/nagios3/nagios.cfg && /etc/init.d/nagios3 restart

The '&&' ensures that the restart only happens if the config is valid.

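The same "check first, restart only on success" logic can be written out as a
small shell function, which makes the behaviour of '&&' explicit. This is only a
sketch: the helper name reload_if_valid is made up for illustration, and it takes
the two commands as arguments so that the logic can be tried without touching
Nagios.

```shell
#!/bin/sh
# Sketch: run a check command, and run the restart command only if it succeeds.
# reload_if_valid is a hypothetical helper name, not a Nagios tool.
reload_if_valid() {
    check_cmd="$1"
    restart_cmd="$2"
    # The variables are left unquoted on purpose so that each one is split
    # into a command and its arguments.
    if $check_cmd; then
        $restart_cmd
    else
        echo "config check failed; not restarting" >&2
        return 1
    fi
}

# Usage (as root):
#   reload_if_valid "nagios3 -v /etc/nagios3/nagios.cfg" "/etc/init.d/nagios3 restart"
```

You can try it safely with "true" or "false" as the check command and an "echo"
as the restart command.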

7c. Verify via the Web Interface
--------------------------------

Go to the web interface (http://pcN.ws.nsrc.org/nagios3) and check that the hosts
you just added are now visible in the interface. Click on the "Hosts" item on the
left of the Nagios screen to see this. You may see them in "PENDING" status until the
first check is carried out.


8. View Status Map
--------------------

Go to http://pcN.ws.nsrc.org/nagios3

Click on the "Map" item on the left. You should see all your hosts with the Nagios
process in the middle. The "?" icons appear because we have not told Nagios what type of host
each item is (router, switch, AP, PC running Linux, etc...)



PART II
Configure Service check for the classroom NOC
-----------------------------------------------------------------------------

0. Configuring

Now that we have our hardware configured we can start telling Nagios what services to monitor
on the configured hardware, how to group the hardware in interesting ways, how to group
services, etc.

1. Associate a service check with our classroom NOC

    # editor hostgroups_nagios2.cfg

    - Find the hostgroup named "ssh-servers". In the members section of the definition
      change the line:

members                 localhost

    to

members                 localhost,noc

Save the file and exit.

Verify that your changes are OK:

        # nagios3 -v /etc/nagios3/nagios.cfg

Restart Nagios to see the new service association with your host:

        # service nagios3 restart

Click on the "Services" link in the Nagios web interface to see your new entry - it should
say "noc        SSH             PENDING ...".



PART III
Defining Services for all PCs
-----------------------------------------------------------------------------

0. For services, the default normal_check_interval is 5 (minutes) in
   generic-service_nagios2.cfg. You may wish to change this to 1 to speed up
   how quickly service issues are detected, at least in the workshop.

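   If you would rather script the interval change than edit the file, something
   like the following could be used. It is only a sketch: the helper name
   set_check_interval_to_1 is made up for illustration, and it assumes the
   setting is on a line of its own with the value 5 and nothing after it.

```shell
#!/bin/sh
# Sketch: change "normal_check_interval 5" to "normal_check_interval 1".
# set_check_interval_to_1 is a hypothetical helper name, not a Nagios tool.
set_check_interval_to_1() {
    cfg="$1"
    # Keep a backup copy next to the original before changing anything.
    cp "$cfg" "$cfg.bak" || return 1
    # Only touch a line whose two fields are exactly the directive and "5";
    # replacing just the trailing "5" keeps the original indentation.
    awk '$1 == "normal_check_interval" && $2 == "5" && NF == 2 {
             sub(/5[ \t]*$/, "1")
         }
         { print }' "$cfg.bak" > "$cfg"
}

# Usage (as root), assuming the file lives with the other conf.d files:
#   set_check_interval_to_1 /etc/nagios3/conf.d/generic-service_nagios2.cfg
```

   As with any other configuration change, run steps 7a - 7c afterwards.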
1. Determine what services to define for what devices

   - This is core to how you use Nagios and network monitoring tools in
     general. So far we are simply using ping to verify that physical hosts
     are up on our network and we have started monitoring a single service on
     a single host (your PC). The next step is to decide what services you wish
     to monitor for each host in the classroom.

   - In this particular class we have:

     routers:  running ssh and snmp
     switches: running telnet and possibly ssh as well as snmp
     pcs:      All PCs are running ssh and http and should be running snmp
               The NOC is currently running an snmp daemon

     So, let's configure Nagios to check for these services for these
     devices.

2.) Verify that SSH is running on the routers and workshop PC images

   - In the file services_nagios2.cfg there is already an entry for the SSH
     service check, so you do not need to create one. Instead, you
     simply need to re-define the "ssh-servers" entry in the file
     /etc/nagios3/conf.d/hostgroups_nagios2.cfg. The initial entry in the file
     looked like:

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost
        }

     What do you think you should change? Correct, the "members" line. You should
     add in entries for all the classroom pcs, routers and the switches that run ssh.
     With this information and the network diagram you should be able to complete this entry.

     The entry will look something like this:

define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,pc1,pc2,pc3,pc4...,pc36,ap1,noc,rtr1,rtr2...,rtr9,gw-rtr
        }

         Note: leave in "localhost" - This is your PC and represents Nagios' network point of
         view. So, for instance, if you are on "pc3" you would not include "pc3" in the list
         of all the classroom pcs as it is represented by the "localhost" entry.

         The "members" entry will be a long line and will likely wrap on the screen. If you want to
         continue the list on a new line, end the line with "\" to show that it continues, like this:

                members         localhost,pc1,pc2,pc3,pc4,pc5,pc6,pc7,pc8,pc9,pc10,pc11,pc12, \
                                pc13,pc14...pc36,ap1,noc,rtr1,rtr2,rtr3...rtr9,gw-rtr

         Remember to include all your PCs and all your routers that you have defined. Do not
         include any entries if they are not already defined in pcs.cfg, switches.cfg or
         routers.cfg.

    - Once you are done, run the pre-flight check and restart Nagios:

        # nagios3 -v /etc/nagios3/nagios.cfg && /etc/init.d/nagios3 restart

    and view your changes in the Nagios web interface.

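Typing a long "members" list by hand is error-prone. The list of PCs can be
produced with a short loop that keeps "localhost" first and skips your own PC,
as described in the note above. This is only a sketch: the helper name
gen_members is made up for illustration, and it covers the PCs only.

```shell
#!/bin/sh
# Sketch: print "localhost,pc1,pc2,..." up to pcLAST, leaving out your own PC
# (it is already represented by "localhost").
# gen_members is a hypothetical helper name, not a Nagios tool.
gen_members() {
    self="$1"       # number of the PC you are sitting at, e.g. 3 for pc3
    last_pc="$2"    # highest PC number defined in pcs.cfg
    list="localhost"
    n=1
    while [ "$n" -le "$last_pc" ]; do
        if [ "$n" -ne "$self" ]; then
            list="$list,pc$n"
        fi
        n=$((n + 1))
    done
    printf '%s\n' "$list"
}

# Usage, sitting at pc3 with pc1 .. pc36 defined:
#   gen_members 3 36
```

Append ap1, noc and the routers to the output yourself, and remove any host that
you have not defined in pcs.cfg, switches.cfg or routers.cfg.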
To continue with hostgroups you can add additional groups for later use, such as all our virtual
routers. Go ahead and edit the file hostgroups_nagios2.cfg again:

     # editor hostgroups_nagios2.cfg

and add the following to the end of the file (COPY and PASTE this):

# A list of our virtual routers

define hostgroup {
        hostgroup_name  routers
                alias           Cisco 7200 Routers
                members         rtr1,rtr2,rtr3,rtr4,rtr5,rtr6,rtr7,rtr8,rtr9
        }

Save and exit from the file. Verify that everything is OK:

    # nagios3 -v /etc/nagios3/nagios.cfg

    If everything looks good, then restart Nagios:

    # service nagios3 restart

3.) Check that http is running on all the classroom PCs.

    - This is almost identical to the previous exercise. Just make the change to the
      hostgroup for the HTTP service, adding in each PC (no routers or switches). Remember, you don't need
      to add your machine as it is already defined as "localhost". Look for this hostgroup
      in the file hostgroups_nagios2.cfg and update the "members" line appropriately.

      If you have questions or are confused, feel free to ask an instructor for help.
