Agenda: exercises-nagios-I-III-basic.txt

File exercises-nagios-I-III-basic.txt, 14.4 KB (added by dean, 5 years ago)
% Nagios Installation and Configuration
%

# Introduction

## Goals

* Install and configure Nagios

## Notes

* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "rtrX>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

# Exercises


# PART I

## 1. Log in to your virtual machine as the sysadm user.

## 2. Install Nagios Version 3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo apt-get install nagios3 nagios3-doc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During installation you will be prompted for the "Nagios web administration password:". This is the password for the Nagios user "nagiosadmin". When prompted, enter the password you are using for your sysadm account.

Note: if you have not already done so, you may be asked to configure
the Postfix Mail Transport Agent during the Nagios installation process.
Just accept the default "Internet Site".

## 3. See Initial Nagios Configuration

Open a browser, and go to your machine like this:

        http://pcN.ws.nsrc.org/nagios3/

At the login prompt, log in as:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        User Name: nagiosadmin
        Password:  <CLASS PASSWORD>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Click on the "Hosts" link on the left of the initial Nagios page to see what has
already been configured.

## 4. Add Routers, PCs and Switches

We will create three files, routers.cfg, switches.cfg and pcs.cfg, and make
entries for the hardware in our classroom.

### 4a. Creating the switches.cfg file

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ cd /etc/nagios3/conf.d        # just to be sure
$ sudo editor switches.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this file add the following entry (COPY and PASTE!):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use         generic-host
    host_name   sw
    alias       Backbone Switch
    address     10.10.0.253
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Save the file and exit.

### 4b. Creating the "routers.cfg" file

We have up to 10 routers in total: rtr1-rtr9 and gw. We also have
1 or 2 wireless Access Points (ap1, ap2). We will define entries for some of
these. If any of these devices do not exist in your workshop, then do not
include them. Remember, COPY and PASTE!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo editor routers.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use         generic-host
    host_name   gw
    alias       Classroom Gateway Router
    address     10.10.0.254
}

define host {
    use         generic-host
    host_name   rtr1
    alias       Group 1 Gateway Router
    address     10.10.1.254
}

define host {
    use         generic-host
    host_name   rtr2
    alias       Group 2 Gateway Router
    address     10.10.2.254
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*** Note: you do not need to add definitions for all routers now - you can
always come back and add the rest later! ***

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use         generic-host
    host_name   ap1
    alias       Wireless Access Point 1
    address     10.10.0.251
}

define host {
    use         generic-host
    host_name   ap2
    alias       Wireless Access Point 2
    address     10.10.0.252
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Now save the file and exit the editor.


### 4c. Creating the pcs.cfg File

Now we will create entries for some of the Virtual Machines in our classroom.
Below we give you the first few entries. You should complete the file with as
many PCs as you wish to add. We recommend that you add at least the 4 PCs
that are members of your group, an entry for the classroom NOC, and
at least one PC from another group (remember to COPY and PASTE!):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo editor pcs.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use         generic-host
    host_name   noc
    alias       Workshop NOC machine
    address     10.10.0.250
}

#
# Group 1
#

define host {
    use         generic-host
    host_name   pc1
    alias       pc1
    address     10.10.1.1
}

define host {
    use         generic-host
    host_name   pc2
    alias       pc2
    address     10.10.1.2
}

#
# Another PC (example only!)
#

define host {
    use         generic-host
    host_name   pc20
    alias       pc20
    address     10.10.5.20
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can save and exit from the file now. You can add more PC entries later.
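
If you would rather not type many near-identical entries by hand, the
following is a minimal sketch of a shell function that prints a "define host"
block for each PC in a range, in the same layout as the entries above. The
function name `gen_pc_hosts` and its arguments are hypothetical, and it
assumes the address pattern suggested by the examples (pcN in group G is at
10.10.G.N). Check the addresses against your classroom network diagram
before using the output.

```shell
# gen_pc_hosts GROUP FIRST LAST
# Print one "define host" block per PC, from pcFIRST to pcLAST.
# Assumption: pcN in group GROUP has the address 10.10.GROUP.N,
# as in the pc1, pc2 and pc20 examples above.
gen_pc_hosts() {
    group=$1
    n=$2
    last=$3
    while [ "$n" -le "$last" ]; do
        printf 'define host {\n'
        printf '    use         generic-host\n'
        printf '    host_name   pc%s\n' "$n"
        printf '    alias       pc%s\n' "$n"
        printf '    address     10.10.%s.%s\n' "$group" "$n"
        printf '}\n\n'
        n=$((n + 1))
    done
}

# Example: print entries for pc3 and pc4 in group 1.
gen_pc_hosts 1 3 4
```

Review what it prints first. If it looks right, you could append it to the
file with something like `gen_pc_hosts 1 3 4 | sudo tee -a pcs.cfg`.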


## STEPS 5a - 5c SHOULD BE REPEATED WHENEVER YOU UPDATE THE CONFIGURATION!


### 5a. Verify that your configuration files are OK

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo nagios3 -v /etc/nagios3/nagios.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


You will get some warnings like the ones below. You can ignore them for
now.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Checking services...
        Checked 7 services.
Checking hosts...
Warning: Host 'gw' has no services associated with it!
Warning: Host 'rtr1' has no services associated with it!
Warning: Host 'rtr2' has no services associated with it!

etc....
...
Total Warnings: N
Total Errors:   0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Things look okay - No serious problems were detected during the check.
Nagios is saying that it's unusual to monitor a device just for its
existence on the network, without also monitoring some service.
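
When the output is long, the two "Total" lines are the ones that matter.
As a rough sketch, the hypothetical helper `check_summary` below reads the
check output on standard input and prints only the totals. It assumes the
"Total Warnings:" and "Total Errors:" lines look like the sample above.

```shell
# check_summary: read "nagios3 -v" output on standard input and print
# the warning and error totals as "warnings=N errors=M".
# Returns non-zero if the error total is missing or is not 0.
check_summary() {
    awk '
        /^Total Warnings:/ { warnings = $3 }
        /^Total Errors:/   { errors = $3 }
        END {
            printf "warnings=%s errors=%s\n", warnings, errors
            exit (errors == "0") ? 0 : 1
        }
    '
}

# Example using sample text in the format shown above (not real output).
check_summary <<'EOF'
Warning: Host 'gw' has no services associated with it!
Total Warnings: 3
Total Errors:   0
EOF
```

With a real installation you would pipe the check into it, for example
`sudo nagios3 -v /etc/nagios3/nagios.cfg | check_summary`.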


### 5b. Reload/Restart Nagios

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo service nagios3 restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

HINT: You will be doing this a lot. If you run the check and the restart on
one line, like this, you can press the up-arrow key to recall the whole command:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo nagios3 -v /etc/nagios3/nagios.cfg && sudo /etc/init.d/nagios3 restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The '&&' ensures that the restart only happens if the config is valid.
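
To see the '&&' rule in isolation, here is a small sketch that uses harmless
stand-in commands instead of Nagios. The function name `run_then` is
hypothetical; it runs the second command only when the first one exits with
status 0, which is the same rule the shell applies to `A && B`.

```shell
# run_then CHECK_CMD ACTION_CMD
# Run ACTION_CMD only if CHECK_CMD succeeds (exit status 0),
# which is the same rule that "CHECK_CMD && ACTION_CMD" applies.
run_then() {
    if sh -c "$1"; then
        sh -c "$2"
    else
        echo "check failed, action skipped" >&2
        return 1
    fi
}

# Harmless stand-ins for the real commands:
run_then "true"  "echo restarting"                            # prints "restarting"
run_then "false" "echo restarting" || echo "not restarted"    # prints "not restarted"
```

In the one-line command above, `nagios3 -v` plays the role of the check and
the restart plays the role of the action.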


### 5c. Verify via the Web Interface

Go to the web interface (http://pcN.ws.nsrc.org/nagios3) and check that the hosts
you just added are now visible in the interface. Click on the "Hosts" item on the
left of the Nagios screen to see this. You may see them in "PENDING" status until the
first check is carried out.


## 6. View Status Map

Go to http://pcN.ws.nsrc.org/nagios3

Click on the "Map" item on the left. You should see all your hosts with the Nagios
process in the middle. The "?" icons appear because we have not told Nagios what type
of host each item is (router, switch, AP, PC running Linux, etc...)



# PART II - Configure Service check for the classroom NOC

## 0. Configuring

Now that we have our hardware configured we can start telling Nagios what services to monitor
on the configured hardware, how to group the hardware in interesting ways, how to group
services, etc.

## 1. Associate a service check with our classroom NOC

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo editor hostgroups_nagios2.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Find the hostgroup named "ssh-servers". In the members section of the definition,
change the line:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
members                 localhost
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

to

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
members                 localhost,noc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Save the file and exit.

Verify that your changes are OK:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo nagios3 -v /etc/nagios3/nagios.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Restart Nagios to see the new service association with your host:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo service nagios3 restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the Nagios web interface, find the "Services" link (left menu), and click
on it.

You should be able to find your recent change:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
noc  SSH      PENDING ...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



# PART III - Defining Services for all PCs


Note: The default normal_check_interval is 5 (minutes) for checking services.
This is defined in "generic-service_nagios2.cfg". You may wish to change
this to 1 (1 minute) to speed up how quickly service issues are detected,
at least during this workshop.
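
If you prefer not to make this change in an editor, the following is a
minimal sketch using sed. The function name `set_check_interval` is
hypothetical; it rewrites the number that follows `normal_check_interval`
in whatever file you pass it. It assumes the directive sits on its own line,
followed by whitespace and a number. Try it on a copy of the file first.

```shell
# set_check_interval FILE MINUTES
# Replace the value on every normal_check_interval line in FILE.
set_check_interval() {
    file=$1
    minutes=$2
    tmp=$(mktemp) || return 1
    sed "s/^\([[:space:]]*normal_check_interval[[:space:]][[:space:]]*\)[0-9][0-9]*/\1$minutes/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example on a scratch file (sample content, not the real file):
demo=$(mktemp)
printf '        normal_check_interval   5\n' > "$demo"
set_check_interval "$demo" 1
cat "$demo"
rm -f "$demo"
```

The files under /etc/nagios3/conf.d are only writable by root, so on the real
file you would work on a copy and then copy the result back with sudo.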

## 1. Determine what services to define for what devices

This is a central concept in using Nagios and network monitoring tools
in general. So far we are simply using ping to verify that physical hosts
are up on our network and we have started monitoring a single service on
a single host (the classroom NOC). The next step is to decide what services
(web server, SSH, etc.) you wish to monitor for each host in the classroom.

In this particular class we have:

* routers:  running ssh and snmp
* switches: running telnet and possibly ssh as well as snmp
* pcs:      All PCs are running ssh and http and should be running snmp.
            The NOC is currently running an snmp daemon.

So, let's configure Nagios to check for these services on these devices.

## 2. Verify that SSH is running on the routers and workshop PC images

In the file "services_nagios2.cfg" there is already an entry for the SSH
service check, so you do not need to create it. Instead, you
simply need to re-define the "ssh-servers" entry in the file
/etc/nagios3/conf.d/hostgroups_nagios2.cfg. The initial entry in the file
looked like:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What do you think you should change? Correct, the "members" line. You
should add entries for all the classroom PCs, routers and the
switches that run ssh. With this information and the network diagram
you should be able to complete this entry.

The entry will look something like this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
define hostgroup {
        hostgroup_name  ssh-servers
        alias           SSH servers
        members         localhost,pc1,pc2,...,ap1,noc,rtr1,rtr2,...,gw
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note: do not remove "localhost" - This is your PC and represents
Nagios' network point of view. So, for instance, if you are on "pc3"
you would NOT list "pc3" in the list of all the classroom PCs as
it is represented by the "localhost" entry.
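
As an illustration of that rule, the sketch below builds the PC part of a
"members" value while leaving out your own machine. The function name
`pc_members` is hypothetical, and it only covers hosts named pcN; routers,
switches and the NOC still have to be added by hand.

```shell
# pc_members SELF FIRST LAST
# Print "localhost,pcFIRST,...,pcLAST", leaving out pcSELF because
# the Nagios server itself is already covered by "localhost".
pc_members() {
    self=$1
    n=$2
    last=$3
    list=localhost
    while [ "$n" -le "$last" ]; do
        if [ "$n" -ne "$self" ]; then
            list="$list,pc$n"
        fi
        n=$((n + 1))
    done
    printf '%s\n' "$list"
}

# Example: you are on pc3 and have defined pc1 to pc4.
pc_members 3 1 4    # prints localhost,pc1,pc2,pc4
```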

The "members" entry will be a long line and will likely wrap on the
screen. If you want to continue the entries on a new line, end the
current line with "\" to indicate that the entry continues on the next line.
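
As an illustration, the sketch below prints a "members" entry split across
two lines with a trailing "\", and then shows the single logical line it
stands for by joining the pieces with sed. The host names are placeholders
and the helper name `join_continued` is hypothetical. The sed command is
only an approximation of how a continued line is read: it drops the
backslash, the line break and any leading whitespace on the continued line.

```shell
# A sample "members" entry continued over two lines (placeholder hosts).
sample='members         localhost,pc1,pc2,\
                pc4,noc,gw'

# join_continued: read text on standard input and join every line that
# ends in a backslash with the line that follows it.
join_continued() {
    sed -e ':a' -e '/\\$/N' -e 's/\\\n[[:space:]]*//' -e 'ta'
}

printf '%s\n' "$sample"                    # as written in the file
printf '%s\n' "$sample" | join_continued   # as one logical line
```

The "\" must be the very last character on the line, with nothing after it.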

Remember to include all the PCs and routers that you have defined in
the files "pcs.cfg", "switches.cfg" and "routers.cfg". Only add entries
from these files (i.e.: don't add "pc8" in your hostgroup list if "pc8"
isn't defined in "pcs.cfg" as well).
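
A quick way to catch such a mismatch before running the pre-flight check is
sketched below. The function name `undefined_members` is hypothetical; it
takes a comma-separated members list plus one or more config files, and
prints every name that has no matching "host_name" line in those files. It
assumes simple host names such as "pc8", with no characters that are special
in a regular expression.

```shell
# undefined_members MEMBERS FILE...
# Print each name in the comma-separated MEMBERS list that has no
# matching "host_name" line in the given files.
undefined_members() {
    members=$1
    shift
    printf '%s\n' "$members" | tr ',' '\n' | while read -r name; do
        [ -n "$name" ] || continue
        if ! grep -q "^[[:space:]]*host_name[[:space:]][[:space:]]*$name[[:space:]]*\$" "$@"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example with a scratch file (sample content only):
demo=$(mktemp)
printf 'define host {\n    host_name   pc1\n}\n' > "$demo"
undefined_members "pc1,pc8" "$demo"    # prints pc8
rm -f "$demo"
```

On the real system you would pass all the files in /etc/nagios3/conf.d, so
that hosts defined outside your three files, such as "localhost", are found
too.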

Once you are done, run the pre-flight check and restart Nagios:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo nagios3 -v /etc/nagios3/nagios.cfg && sudo /etc/init.d/nagios3 restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

... and view your changes in the Nagios web interface.

To continue with hostgroups, you can add additional groups for later use, such as one for all our virtual routers. Go ahead and edit the file hostgroups_nagios2.cfg again:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo editor hostgroups_nagios2.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and add the following to the end of the file (COPY and PASTE this):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# A list of our virtual routers

define hostgroup {
        hostgroup_name  routers
        alias           Cisco 7200 Routers
        members         rtr1,rtr2,...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Only list the routers you have defined in "routers.cfg".

Save and exit from the file. Verify that everything is OK:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo nagios3 -v /etc/nagios3/nagios.cfg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If everything looks good, then restart Nagios:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo service nagios3 restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

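## 3. Check that http is running on all the classroom PCs.

This is almost identical to the previous exercise. Just make the same change
for the HTTP service check, adding each PC (no routers or switches). Remember,
you don't need to add your machine as it is already defined as
"localhost". Look for the corresponding hostgroup in the file
hostgroups_nagios2.cfg and update the "members" line appropriately.

If you have questions or are confused please ask an instructor for help.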
425"localhost". Look for this hostgroup in the file hostgroups_nagios2.cfg
426and update the "members" line appropriately.
427     
428If you have questions or are confused please ask an instructor for help.