1 | Network Monitoring and Management |
---|
2 | |
---|
3 | Cacti, Nagios and Smokeping Ticket Creation with Request Tracker |
---|
4 | ---------------------------------------------------------------- |
---|
5 | |
---|
6 | Notes: |
---|
7 | ------ |
---|
8 | * Commands preceded with "$" imply that you should execute the command as |
---|
9 | a general user - not as root. |
---|
10 | * Commands preceded with "#" imply that you should be working as root. |
---|
11 | * Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>") |
---|
12 | imply that you are executing commands on remote equipment, or within |
---|
13 | another program. |
---|
14 | |
---|
15 | Exercises |
---|
16 | --------- |
---|
17 | |
---|
18 | At this point in the week you should have Cacti, Nagios and Smokeping |
---|
19 | installed on your PCs. These exercises show you how to set up each |
---|
20 | of these programs to send alerts to the RT (Request Tracker) ticketing |
---|
21 | system to generate tickets. |
---|
22 | |
---|
23 | |
---|
24 | Exercises Part I |
---|
25 | ---------------- |
---|
26 | |
---|
27 | 0. Log in to your PC or open a terminal window as the sysadm user. |
---|
28 | |
---|
29 | 1. Verify that you have configured rt-mailgate to work with your MTA |
---|
30 | --------------------------------------------------------------------- |
---|
31 | |
---|
32 | Open the file /etc/aliases: |
---|
33 | |
---|
34 | $ sudo editor /etc/aliases |
---|
35 | |
---|
36 | In the file /etc/aliases you should have the following two lines: |
---|
37 | |
---|
38 | net-comment: "|/usr/bin/rt-mailgate --queue net --action comment --url http://localhost/rt/" |
---|
39 | net: "|/usr/bin/rt-mailgate --queue net --action correspond --url http://localhost/rt/" |
---|
40 | |
---|
41 | If these lines are not in /etc/aliases, then be sure to add them. When you are done save |
---|
42 | the file and exit. Then you need to tell the MTA (Mail Transfer Agent) that there are some |
---|
43 | new aliases to be used: |
---|
44 | |
---|
45 | $ sudo newaliases |
---|
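As a quick sanity check, the alias test above can be scripted. This is a minimal sketch, not part of RT: the helper name check_rt_aliases is hypothetical, and it only looks for lines that start with the alias name and mention rt-mailgate.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RT): report which of the two
# rt-mailgate aliases from this exercise are present in an aliases file.
check_rt_aliases() {
    aliases_file=$1
    missing=0
    for alias_name in net net-comment; do
        # Match a line that starts with "<alias>:" and mentions rt-mailgate.
        if grep -Eq "^${alias_name}:.*rt-mailgate" "$aliases_file"; then
            echo "OK: ${alias_name}"
        else
            echo "MISSING: ${alias_name}"
            missing=1
        fi
    done
    # Non-zero exit status if at least one alias is missing.
    return "$missing"
}
```

Calling "check_rt_aliases /etc/aliases" prints one line per alias and returns a non-zero status if either entry is absent.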
46 | |
---|
47 | |
---|
48 | 2. Configure Smokeping |
---|
49 | ---------------------- |
---|
50 | |
---|
51 | In the file: |
---|
52 | |
---|
53 | /etc/smokeping/config.d/Alerts |
---|
54 | |
---|
55 | You can tell Smokeping where alert outputs should go. Edit the file: |
---|
56 | |
---|
57 | $ sudo vi /etc/smokeping/config.d/Alerts |
---|
58 | |
---|
59 | And update the top of the file to be: |
---|
60 | |
---|
61 | *** Alerts *** |
---|
62 | to = net@localhost |
---|
63 | from = smokealert@localhost |
---|
64 | |
---|
65 | At the end of the file, add another alert like this: |
---|
66 | |
---|
67 | +anydelay |
---|
68 | type = rtt |
---|
69 | # in milliseconds |
---|
70 | pattern = >1 |
---|
71 | comment = Just for testing |
---|
72 | |
---|
73 | Be sure that all text is flush left in the file. |
---|
74 | |
---|
75 | Now exit and save the file. |
---|
76 | |
---|
77 | Notice the pattern in this alert. It means that an alert will be triggered |
---|
78 | as soon as a sample measurement has "ANY" delay, that is, more than one |
---|
79 | millisecond. This is just for testing. In reality, you will want to create |
---|
80 | an alert based on your observed baseline: for example, one that fires if your DNS servers' |
---|
81 | delay suddenly goes from under 10 ms to over 100 ms. |
---|
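A baseline-style alert of the kind described above might look like the stanza in this sketch. The alert name "dnsjump", the thresholds, the number of samples and the helper name are all assumptions for illustration. The function only writes the stanza to a scratch file so that you can review it before copying it into the Alerts file by hand.

```shell
#!/bin/sh
# Sketch: write a baseline-style Smokeping alert stanza to a scratch file.
# "dnsjump" and the 10 ms / 100 ms thresholds are assumptions; choose
# values that match the baseline you actually observe.
write_baseline_alert() {
    out_file=$1
    cat > "$out_file" <<'EOF'
+dnsjump
type = rtt
# in milliseconds
pattern = <10,<10,<10,>100,>100,>100
comment = DNS latency jumped from under 10 ms to over 100 ms
EOF
}
```

The pattern is read left to right over consecutive samples: three measurements under 10 ms followed by three over 100 ms. A target would then reference it with "alerts = dnsjump".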
82 | |
---|
83 | Next, be sure you have this test alert defined for some of your Targets. |
---|
84 | You can either turn on alerts by defining alerts for a probe in |
---|
85 | the /etc/smokeping/config.d/Probes file, or by individual Targets |
---|
86 | entries. |
---|
87 | |
---|
88 | In our case let's edit the Targets file and turn on alerts for our |
---|
89 | DNS Latency checks. |
---|
90 | |
---|
91 | $ sudo vi /etc/smokeping/config.d/Targets |
---|
92 | |
---|
93 | Find (or add if necessary) the following section in the file: |
---|
94 | |
---|
95 | +DNS |
---|
96 | probe = DNS |
---|
97 | ... |
---|
98 | |
---|
99 | Now let's add an entry for a global DNS server that responds recursively. |
---|
100 | |
---|
101 | ++GoogleA |
---|
102 | menu = 8.8.8.8 |
---|
103 | title = DNS Latency for google-public-dns-a.google.com |
---|
104 | host = google-public-dns-a.google.com |
---|
105 | alerts = anydelay |
---|
106 | |
---|
107 | Notice the line that says, "alerts = anydelay". |
---|
108 | |
---|
109 | So, in summary - you should have in your Targets file the following section near |
---|
110 | the bottom of the file: |
---|
111 | |
---|
112 | +DNS |
---|
113 | probe = DNS |
---|
114 | menu = DNS Latency |
---|
115 | title = DNS Latency Probes |
---|
116 | |
---|
117 | ++GoogleA |
---|
118 | menu = 8.8.8.8 |
---|
119 | title = DNS Latency for google-public-dns-a.google.com |
---|
120 | host = google-public-dns-a.google.com |
---|
121 | alerts = anydelay |
---|
122 | |
---|
123 | (items should be flush left in the file). |
---|
124 | |
---|
125 | Save and exit from the file, then restart smokeping: |
---|
126 | |
---|
127 | $ sudo service smokeping restart |
---|
128 | |
---|
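If no ticket shows up, one thing worth ruling out is a typo in an alert name. The sketch below lists alert names that the Targets file references but that have no matching "+name" section in the Alerts file. The helper name undefined_alerts is hypothetical, and the sketch assumes entries are flush left, as these exercises require.

```shell
#!/bin/sh
# Hypothetical helper: print every alert name used in "alerts = ..." lines
# of a Targets file that has no "+name" definition in the Alerts file.
undefined_alerts() {
    alerts_file=$1
    targets_file=$2
    # Collect the comma-separated names from each "alerts = a,b" line.
    sed -n 's/^alerts[[:space:]]*=[[:space:]]*//p' "$targets_file" |
        tr ',' '\n' | sed 's/[[:space:]]//g' | sort -u |
    while read -r name; do
        [ -n "$name" ] || continue
        # An alert is defined by a line of the form "+name".
        grep -Eq "^\+[[:space:]]*${name}[[:space:]]*$" "$alerts_file" || echo "$name"
    done
}
```

Running "undefined_alerts /etc/smokeping/config.d/Alerts /etc/smokeping/config.d/Targets" should print nothing when every referenced alert is defined.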
129 | Now check RT to see if you have received anything from Smokeping. It may take up to 5 minutes |
---|
130 | for a new ticket to appear. |
---|
131 | |
---|
132 | NOTE: - If you have not already configured the DNS Latency checks for Smokeping you may need to |
---|
133 | edit the file /etc/smokeping/config.d/Probes and add in the entry for DNS like this: |
---|
134 | |
---|
135 | $ sudo vi /etc/smokeping/config.d/Probes |
---|
136 | |
---|
137 | And, at the bottom of the file add: |
---|
138 | |
---|
139 | + DNS |
---|
140 | binary = /usr/bin/dig |
---|
141 | pings = 5 |
---|
142 | step = 180 |
---|
143 | lookup = www.nsrc.org |
---|
144 | |
---|
145 | Save and exit from the file and restart Smokeping: |
---|
146 | |
---|
147 | $ sudo service smokeping restart |
---|
148 | |
---|
149 | |
---|
150 | 3. Nagios and Request Tracker Ticket Creation |
---|
151 | ---------------------------------------------- |
---|
152 | |
---|
153 | Configuring RT and Nagios so that alerts from Nagios automatically |
---|
154 | create tickets requires a few steps: |
---|
155 | |
---|
156 | * Create a proper contact entry for Nagios in |
---|
157 | /etc/nagios3/conf.d/contacts_nagios2.cfg |
---|
158 | |
---|
159 | * Create the proper command in Nagios to use the rt-mailgate |
---|
160 | interface. The command is defined in /etc/nagios3/commands.cfg |
---|
161 | |
---|
162 | These next two items should already be done in RT if you have |
---|
163 | finished the RT exercises. |
---|
164 | |
---|
165 | * Install the rt-mailgate software and configure it properly |
---|
166 | in your /etc/aliases file for your MTA in use. |
---|
167 | |
---|
168 | * Configure the appropriate queues in RT to receive emails |
---|
169 | passed to it from Nagios via the rt-mailgate software. |
---|
170 | |
---|
171 | |
---|
172 | 5. Configure a Contact in Nagios |
---|
173 | --------------------------------- |
---|
174 | |
---|
175 | - Edit the file /etc/nagios3/conf.d/contacts_nagios2.cfg |
---|
176 | |
---|
177 | $ sudo bash |
---|
178 | # vi /etc/nagios3/conf.d/contacts_nagios2.cfg |
---|
179 | |
---|
180 | - In this file we will first add a new contact name under |
---|
181 | the default root contact entry. The new contact should |
---|
182 | look like this: |
---|
183 | |
---|
184 | define contact{ |
---|
185 | contact_name net |
---|
186 | alias RT Alert Queue |
---|
187 | service_notification_period 24x7 |
---|
188 | host_notification_period 24x7 |
---|
189 | service_notification_options c |
---|
190 | host_notification_options d |
---|
191 | service_notification_commands notify-service-ticket-by-email |
---|
192 | host_notification_commands notify-host-ticket-by-email |
---|
193 | email net@localhost |
---|
194 | } |
---|
195 | |
---|
196 | - _DO NOT_ remove the "root" contact_name entry! The new entry goes |
---|
197 | below the "root" contact. |
---|
198 | |
---|
199 | - the service_notification_options value of "c" means only notify once a |
---|
200 | service is considered "critical" by Nagios (i.e. down). The |
---|
201 | host_notification_options value of "d" means down. Specifying only "c" |
---|
202 | and "d" means that notifications will not be sent for other |
---|
203 | states. |
---|
204 | |
---|
205 | - Note the email address in use "net@localhost" - this is important |
---|
206 | as this was previously defined for RT. |
---|
207 | |
---|
208 | - Now we must create a Contact Group that contains this contact. |
---|
209 | We will call this group "tickets." Do this at the end of the file: |
---|
210 | |
---|
211 | define contactgroup{ |
---|
212 | contactgroup_name tickets |
---|
213 | alias email to ticket system for RT |
---|
214 | members net,root |
---|
215 | } |
---|
216 | |
---|
217 | - You could leave off "root" as a member, but we've left this on to |
---|
218 | have another user that receives email to help us troubleshoot if |
---|
219 | there are issues. |
---|
220 | |
---|
221 | - Now that your contact has been created you need to create the commands |
---|
222 | that were referenced in the initial contact creation above. These are |
---|
223 | "notify-service-ticket-by-email" and "notify-host-ticket-by-email". |
---|
224 | |
---|
225 | |
---|
226 | 6. Update Nagios Commands |
---|
227 | ------------------------- |
---|
228 | |
---|
229 | - To create the notify-service-ticket-by-email and notify-host-ticket-by-email |
---|
230 | commands we need to edit the file /etc/nagios3/commands.cfg. |
---|
231 | |
---|
232 | # vi /etc/nagios3/commands.cfg |
---|
233 | |
---|
234 | - In this file you already have two command definitions that we are using. These are |
---|
235 | called notify-host-by-email and notify-service-by-email. We are going to add two |
---|
236 | new commands. |
---|
237 | |
---|
238 | - We _strongly_ suggest that you COPY and PASTE the text below. It is almost impossible |
---|
239 | to type it without errors. |
---|
240 | |
---|
241 | - Put these two new entries _BELOW_ the current notify-host-by-email and notify-service-by-email |
---|
242 | command entries. Do not remove the old ones. |
---|
243 | |
---|
244 | - NOTE: The command_line entries below do not contain line breaks. Each is a single line. Be aware of this as |
---|
245 | COPY and PASTE between some editors and environments may insert line breaks. |
---|
246 | |
---|
247 | ################################################################ |
---|
248 | # Additional commands created for network management workshop # |
---|
249 | ################################################################ |
---|
250 | |
---|
251 | # 'notify-host-ticket-by-email' command definition |
---|
252 | define command{ |
---|
253 | command_name notify-host-ticket-by-email |
---|
254 | command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$ |
---|
255 | } |
---|
256 | |
---|
257 | # 'notify-service-ticket-by-email' command definition |
---|
258 | define command{ |
---|
259 | command_name notify-service-ticket-by-email |
---|
260 | command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$ |
---|
261 | } |
---|
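Because the command_line entries are long, it can help to preview the message body outside Nagios. The sketch below renders the host notification text with sample values in place of the Nagios macros. The helper name preview_host_alert and its argument order are hypothetical; it prints to standard output only and sends no mail.

```shell
#!/bin/sh
# Sketch: render the host notification body with sample values instead of
# Nagios macros, so the layout can be checked before Nagios sends real mail.
# Arguments: type, host name, state, address, plugin output, date string.
preview_host_alert() {
    notification_type=$1
    host_name=$2
    host_state=$3
    host_address=$4
    host_output=$5
    date_time=$6
    # Same "%b" layout as the notify-host-ticket-by-email command_line.
    printf "%b" "***** Nagios *****\n\nNotification Type: ${notification_type}\nHost: ${host_name}\nState: ${host_state}\nAddress: ${host_address}\nInfo: ${host_output}\n\nDate/Time: ${date_time}\n"
}
```

For example, preview_host_alert PROBLEM s1 DOWN 10.10.0.241 "Host Unreachable" "$(date)" prints the body to the terminal. Piping that output to /usr/bin/mail -s "test" net@localhost should create a test ticket, assuming the rt-mailgate aliases from step 1 are in place.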
262 | |
---|
263 | |
---|
264 | 7. Choose a Service to Monitor with RT Tickets |
---|
265 | ---------------------------------------------- |
---|
266 | |
---|
267 | |
---|
268 | - The final step is to tell Nagios that you wish to notify the contact group "tickets" for a |
---|
269 | particular service. If you look in /etc/nagios3/conf.d/generic-service_nagios2.cfg the |
---|
270 | default contact_groups is "admins". To override this for a service edit the file |
---|
271 | /etc/nagios3/conf.d/services_nagios2.cfg and add a contact_groups entry for one of the |
---|
272 | service definitions. |
---|
273 | |
---|
274 | - To send email to generate tickets in RT if HTTP goes down on a box you would edit the |
---|
275 | HTTP service check so that it looks like this: |
---|
276 | |
---|
277 | # check that web services are running |
---|
278 | define service { |
---|
279 | hostgroup_name http-servers |
---|
280 | service_description HTTP |
---|
281 | check_command check_http |
---|
282 | use generic-service |
---|
283 | notification_interval 0 ; set > 0 if you want to be renotified |
---|
284 | contact_groups tickets |
---|
285 | } |
---|
286 | |
---|
287 | Note the additional item that we now have, "contact_groups." You can do this for other |
---|
288 | entries as well if you wish. |
---|
289 | |
---|
290 | - When you are done, save the file and exit. |
---|
291 | |
---|
292 | - Now restart Nagios to verify your changes are correct. |
---|
293 | |
---|
294 | # /etc/init.d/nagios3 stop |
---|
295 | # /etc/init.d/nagios3 start |
---|
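Before restarting, you may want Nagios to check the configuration itself. The sketch below wraps that pre-flight check. The paths in NAGIOS_BIN and NAGIOS_CFG are assumptions based on a Debian/Ubuntu nagios3 package, and the helper name nagios_config_ok is hypothetical.

```shell
#!/bin/sh
# Sketch: run the Nagios pre-flight check and report whether it passed.
# NAGIOS_BIN and NAGIOS_CFG are assumed paths; adjust them if your
# installation differs.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/sbin/nagios3}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}

nagios_config_ok() {
    # "-v" asks Nagios to verify the configuration without starting.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        echo "configuration OK"
    else
        echo "configuration has errors; run $NAGIOS_BIN -v $NAGIOS_CFG to see them"
        return 1
    fi
}
```

As root, "nagios_config_ok && /etc/init.d/nagios3 restart" would then restart Nagios only when the check passes.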
296 | |
---|
297 | |
---|
298 | 4.) Generate RT Tickets for Hosts |
---|
299 | --------------------------------- |
---|
300 | |
---|
301 | - To do this you must either specify "contact_groups tickets" for individual host |
---|
302 | definitions, or you must update the template file for all hosts and change the |
---|
303 | default contact_groups entry to tickets. This file is generic-host_nagios2.cfg. |
---|
304 | |
---|
305 | - If you wish to do this go ahead. Tickets will be generated if a host goes down |
---|
306 | and you have specified the contact_groups for that host as being "tickets". |
---|
307 | |
---|
308 | 5. See Nagios Tickets in RT |
---|
309 | --------------------------- |
---|
310 | |
---|
311 | To verify that your changes have worked we can monitor HTTP on one of our |
---|
312 | servers that is not running HTTP. Let's pick the second Mac Mini in our class |
---|
313 | or the box known as "s1.ws.nsrc.org" (see the network diagram for details). |
---|
314 | |
---|
315 | If you do not have an entry for this machine add one to the file where your PCs |
---|
316 | are defined. If this is in a file called pcs.cfg you would do: |
---|
317 | |
---|
318 | # vi /etc/nagios3/conf.d/pcs.cfg |
---|
319 | |
---|
320 | In this file add (or verify you have) an entry that looks like this: |
---|
321 | |
---|
322 | define host { |
---|
323 | use generic-host |
---|
324 | host_name s1 |
---|
325 | alias s1 |
---|
326 | address 10.10.0.241 |
---|
327 | parents sw |
---|
328 | } |
---|
329 | |
---|
330 | Save and exit from the file. |
---|
331 | |
---|
332 | Now edit the file named /etc/nagios3/conf.d/hostgroups_nagios2.cfg and add s1 to the hostgroup |
---|
333 | for HTTP service checks: |
---|
334 | |
---|
335 | # vi /etc/nagios3/conf.d/hostgroups_nagios2.cfg |
---|
336 | |
---|
337 | Look for the "hostgroup_name http-servers" entry and update it so that it looks like this: |
---|
338 | |
---|
339 | |
---|
340 | # A list of your web servers |
---|
341 | define hostgroup { |
---|
342 | hostgroup_name http-servers |
---|
343 | alias HTTP servers |
---|
344 | members localhost,pc1,pc2,pc3,pc4,pc5,pc6,pc7,pc8,pc9,pc10,pc11,pc12, |
---|
345 | pc13,pc14,pc15,pc16,pc17,pc18,pc19,pc20,pc21,pc22,pc23,pc24, |
---|
346 | pc25,pc26,pc28,pc29,pc30,pc31,pc32,pc35,pc37,pc39,s1 |
---|
347 | } |
---|
348 | |
---|
349 | |
---|
350 | _REMEMBER_ that the line with all the "members" must not have any line breaks. Notice that "s1" |
---|
351 | has been entered on the end of the line. |
---|
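A broken members line is easy to miss by eye. As a rough check, the sketch below reports whether a host appears on the members line of a given hostgroup. The helper name hostgroup_has_member is hypothetical, and it assumes the whole members list sits on one line, so a host that ended up on a wrapped continuation line will be reported as missing.

```shell
#!/bin/sh
# Hypothetical helper: succeed if HOST is listed in the members line of the
# given hostgroup in a Nagios hostgroups file.
# Usage: hostgroup_has_member FILE GROUP HOST
hostgroup_has_member() {
    cfg_file=$1
    group=$2
    host=$3
    awk -v group="$group" -v host="$host" '
        $1 == "hostgroup_name" { in_group = ($2 == group) }
        in_group && $1 == "members" {
            # Rejoin fields 2..NF in case there are spaces after commas.
            list = ""
            for (i = 2; i <= NF; i++) list = list $i
            n = split(list, names, ",")
            for (i = 1; i <= n; i++) if (names[i] == host) found = 1
        }
        $1 == "}" { in_group = 0 }
        END { exit (found ? 0 : 1) }
    ' "$cfg_file"
}
```

For example, "hostgroup_has_member /etc/nagios3/conf.d/hostgroups_nagios2.cfg http-servers s1 && echo found" should print "found" once the edit above is in place.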
352 | |
---|
353 | Now save the file and exit and restart Nagios: |
---|
354 | |
---|
355 | # service nagios3 stop |
---|
356 | # service nagios3 start |
---|
357 | |
---|
358 | |
---|
359 | - It will take a while (up to 10 minutes) for Nagios to report that HTTP is |
---|
360 | "critical", but once that happens a new ticket should appear in your RT instance |
---|
361 | in the net queue generated by Nagios. |
---|
362 | |
---|
363 | - Remember, to see this go to http://pcX.ws.nsrc.org/rt/ and log in as Username "sysadmin" |
---|
364 | with the password you chose when you created the RT sysadmin account. The new |
---|
365 | ticket should appear in the "10 newest unowned tickets" box in the main log in |
---|
366 | page in RT. |
---|
367 | |
---|
368 | 6. Configure Cacti to send emails to net@localhost to generate tickets in RT |
---|
369 | ---------------------------------------------------------------------------- |
---|
370 | |
---|
371 | If you have not installed the Plugin Architecture for Cacti, then please be sure to |
---|
372 | attempt this exercise last. |
---|
373 | |
---|
374 | You can view how this works by logging in on the Cacti instance running on the noc |
---|
375 | box as this has the Cacti Plugin Architecture installed and the two plugins called, |
---|
376 | "Settings" and "Threshold". |
---|
377 | |
---|
378 | To see how Cacti can generate a ticket first go to: |
---|
379 | |
---|
380 | http://noc.ws.nsrc.org/cacti/ |
---|
381 | |
---|
382 | Log in as "admin" (system password). Then do: |
---|
383 | |
---|
384 | * Click on the Console tab (upper-left) |
---|
385 | * Click on "Settings" (lower-left) |
---|
386 | * Click on the "Mail / DNS" tab (upper-right) |
---|
387 | * Verify that the fields for email are properly filled in: |
---|
388 | - Test Email (sysadm or net @ localhost) |
---|
389 | - Mail Services (PHP Mail() Function) |
---|
390 | - From Email Address (cacti@localhost) |
---|
391 | - From Name (Cacti System Monitor) |
---|
392 | - SMTP Hostname (localhost) |
---|
393 | - SMTP Port (25) |
---|
394 | |
---|
395 | Now we need to create a threshold that we'll use to trigger an email that, in turn, will |
---|
396 | create a ticket in RT: |
---|
397 | |
---|
398 | * Click on "Thresholds" (middle-left) |
---|
399 | * Click on the "Add" option (upper-right) |
---|
400 | * Select a Host (localhost, for example) |
---|
401 | * Select a Graph (Processes) |
---|
402 | * Select the Data Source (proc) |
---|
403 | * Click on the "create" button |
---|
404 | |
---|
405 | Now you will be presented with a detailed screen where you can specify what should |
---|
406 | happen if the threshold is reached. Verify or do the following: |
---|
407 | |
---|
408 | * Threshold Name: Something Descriptive |
---|
409 | * Verify that "Threshold Enabled" is checked |
---|
410 | * Threshold Type: High / Low Values (for Processes) |
---|
411 | * High Threshold: 50 (this will cause the threshold to trip) |
---|
412 | * Breach Duration: 5 minutes (this will give us a ticket in 5 to 10 minutes) |
---|
413 | * Data Type: Exact Value |
---|
414 | * Re-Alert Cycle: Never |
---|
415 | * Extra Alert Emails: net@localhost,sysadm@localhost |
---|
416 | |
---|
417 | This will send an email to net@localhost within 5 or 10 minutes. This will create a |
---|
418 | new ticket in RT. In addition an email will go to sysadm@localhost. You can view the |
---|
419 | email as the sysadm user by doing: |
---|
420 | |
---|
421 | $ mutt -f /var/mail/sysadm |
---|
422 | |
---|
423 | You can create all types of threshold states that can be tripped, which will result in |
---|
424 | ticket creation. Feel free to play around with the cacti instance on the Noc to create |
---|
425 | new thresholds. You can see if they are working by logging in on the Noc instance of |
---|
426 | Request Tracker (RT) at: |
---|
427 | |
---|
428 | http://noc.ws.nsrc.org/rt/ |
---|
429 | |
---|
430 | The username is "sysadm" and the password is the class password. |
---|
431 | |
---|
432 | |
---|
433 | +-----+ |
---|
434 | Last update 2jun2011 |
---|
435 | Hervey Allen |
---|