Agenda: exercises-smokeping.txt

File exercises-smokeping.txt, 18.7 KB (added by b.candler, 6 years ago)
% Network Management & Monitoring
% Smokeping

Exercises
---------

In this exercise you will install Smokeping and configure it to monitor various
devices in the class network.

Since most of the tasks in this exercise require you to be "root", the
first thing you should do is connect to your PC and start a root shell.

    $ sudo bash
    #


1. Install Smokeping
--------------------

    # apt-get install smokeping

Then point your web browser at

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi

(replace "pcN" with your own PC) to check that it is running.


2. Initial Configuration
------------------------

    # cd /etc/smokeping/config.d
    # ls -l

    -rwxr-xr-x 1 root root  578 2010-02-26 01:55 Alerts
    -rwxr-xr-x 1 root root  237 2010-02-26 01:55 Database
    -rwxr-xr-x 1 root root  413 2010-02-26 05:40 General
    -rwxr-xr-x 1 root root  271 2010-02-26 01:55 pathnames
    -rwxr-xr-x 1 root root  859 2010-02-26 01:55 Presentation
    -rwxr-xr-x 1 root root  116 2010-02-26 01:55 Probes
    -rwxr-xr-x 1 root root  155 2010-02-26 01:55 Slaves
    -rwxr-xr-x 1 root root 8990 2010-02-26 06:30 Targets

The files that you'll need to change, at a minimum, are:

* Alerts
* General
* Probes
* Targets

Now open the General file (note the capital first letter):

    # editor General

Change the following lines:

~~~~
owner    = NOC
contact  = sysadm@localhost
mailhost = localhost
cgiurl   = http://localhost/cgi-bin/smokeping.cgi
# specify this to get syslog logging
syslogfacility = local5
~~~~

Save the file and exit. Now let's restart the Smokeping service to verify
that no mistakes have been made before going any further:

    # service smokeping stop
    # service smokeping start

A quicker way to do this is:

    # service smokeping restart

We'll use this for the rest of the exercises, or we'll just use the "reload"
command, as this is all you need for Smokeping to pick up configuration file
changes.
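
Before reloading, it can help to have Smokeping parse the configuration without touching the running daemon. The sketch below assumes the `smokeping` binary is on your PATH and uses its `--check` option, which only checks the config file syntax. The function name `check_smokeping_config` is just an illustrative choice.

```shell
# Illustrative helper (the name is our own): ask Smokeping to parse its
# configuration and report syntax errors, without restarting the daemon.
check_smokeping_config() {
    if command -v smokeping >/dev/null 2>&1; then
        # --check: only check the config file syntax, then exit
        smokeping --check
    else
        echo "smokeping not found in PATH" >&2
        return 127
    fi
}
```

Assuming the check exits with a non-zero status when it finds an error, `check_smokeping_config && service smokeping reload` would reload only a configuration that parses cleanly.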

Now open the Alerts file (note the capital first letter):

    # editor Alerts

Change the following lines:

~~~~
to = root@localhost
from = smokeping-alert@localhost
~~~~

Save the file and exit. Reload Smokeping:

    # service smokeping reload


3. Configure monitoring of devices
----------------------------------

The majority of your time and work configuring Smokeping will be done in the
file /etc/smokeping/config.d/Targets.

For this class please do the following:

Use the default FPing probe to check:

- some of the student PCs
- classroom NOC
- switches
- routers

You can use the classroom Network Diagram on the classroom wiki to figure out addresses
for each item, etc.

Create some hierarchy in the Smokeping menu for your checks. The Targets file is
already partially preconfigured, so to start we are going to add some entries to
this file. Start with:

    # cd /etc/smokeping/config.d
    # editor Targets

You can take the section from `*** Targets ***` to the end of the LocalMachine entry and make it
look something like this. Feel free to use your own "remark", "menu" text and titles. Note
that we remove the commented lines `#parents = owner:/Test/James location:/`, and the "Alerts"
line.

NOTE: We strongly recommend that you COPY and PASTE text from these exercises directly into the
Targets file. Typing all this by hand will take too long.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*** Targets ***

probe = FPing

menu = Top
title = Network Latency Grapher
remark = Smokeping Latency Grapher for Network Monitoring \
         and Management Workshop.

+Local

menu = Local Network Monitoring and Management
title = Local Network

++LocalMachine

menu = Local Machine
title = This host
host = localhost
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now, below the "localhost" entry, we start with the configuration of items for our class.
We can start simple and add just the first 4 PCs that are in Group 1 as well as an
entry for our classroom NOC.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# ********* Classroom Servers **********
#

+Servers

menu = Servers
title = Network Management Servers

++noc

menu = noc
title = Workshop NOC
host = noc.ws.nsrc.org

#
# ******** Student Machines (VMs) ***********
#

+PCs

menu = Lab PCs
title = Virtual PCs Network Management

++pc1

menu = pc1
title = Virtual Machine 1
host = pc1.ws.nsrc.org


++pc2

menu = pc2
title = Virtual Machine 2
host = pc2.ws.nsrc.org


++pc3

menu = pc3
title = Virtual Machine 3
host = pc3.ws.nsrc.org


++pc4

menu = pc4
title = Virtual Machine 4
host = pc4.ws.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


OK. Let's see if we can get Smokeping to reload with the changes we have
made so far. Save and exit from the Targets file. Now try doing:

    # service smokeping reload

If you see error messages, then read them closely and try to correct the problem
in the Targets file. In addition, Smokeping is now sending log messages to the file
/var/log/syslog. You can view what Smokeping is saying by typing:

    # tail /var/log/syslog

If you want to see all smokeping related messages in the file /var/log/syslog you
can do this:

    # grep smokeping /var/log/syslog
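
The two commands above can be combined so that you only see the most recent Smokeping messages. This is a minimal sketch; the function name `smokeping_log` and the default of 20 lines are our own choices, not part of Smokeping.

```shell
# Print the last COUNT Smokeping-related lines from a log file.
# $1 = log file (default /var/log/syslog), $2 = COUNT (default 20).
smokeping_log() {
    sp_logfile="${1:-/var/log/syslog}"
    sp_count="${2:-20}"
    # Keep only lines mentioning smokeping, then show the newest ones.
    grep -i 'smokeping' "$sp_logfile" | tail -n "$sp_count"
}
```

For example, `smokeping_log` shows the last 20 matching lines from /var/log/syslog, and `smokeping_log /var/log/syslog 5` shows the last 5.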

If there are no errors you can view the results of your changes by going to:

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi

When you are ready you can edit the Targets file again and continue to add machines.
At the bottom of the file you can add the next group of PCs:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
++pc5

menu = pc5
title = Virtual Machine 5
host = pc5.ws.nsrc.org


++pc6

menu = pc6
title = Virtual Machine 6
host = pc6.ws.nsrc.org


++pc7

menu = pc7
title = Virtual Machine 7
host = pc7.ws.nsrc.org


++pc8

menu = pc8
title = Virtual Machine 8
host = pc8.ws.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
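
The stanzas above all follow the same pattern, so if you have many PCs you may prefer to generate them. The loop below is a sketch under that assumption: the function name `pc_targets` is hypothetical, and the menu, title and host text simply copy the pattern used in this exercise.

```shell
# Print Targets stanzas for pcFIRST..pcLAST on standard output.
# $1 = FIRST, $2 = LAST (both are PC numbers).
pc_targets() {
    pc_n="$1"
    while [ "$pc_n" -le "$2" ]; do
        printf '++pc%s\n\n' "$pc_n"
        printf 'menu = pc%s\n' "$pc_n"
        printf 'title = Virtual Machine %s\n' "$pc_n"
        printf 'host = pc%s.ws.nsrc.org\n\n' "$pc_n"
        pc_n=$((pc_n + 1))
    done
}
```

Running `pc_targets 9 12` prints entries for pc9 to pc12. Check the output, then paste it into the Targets file below the last PC entry.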


Add as many PCs as you want, then save and exit from the Targets file and verify
that the changes you have made are working:

    # service smokeping reload

You can continue to view the updated results of your changes on the Smokeping
web page. It may take up to 5 minutes before graphs begin to appear.

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi



4. Configure monitoring of routers and switches
-----------------------------------------------

Once you have configured as many PCs as you want to configure, then it's time to
add in some entries for the classroom routers and switch(es).

    # cd /etc/smokeping/config.d                (just to be sure :-))
    # editor Targets

Go to the bottom of the file and add in some entries for routers and switches:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# ********** Classroom Backbone Switch *********
#

+Switches

menu = Switches
title = Switches Network Management

++sw

menu = sw
title = Backbone Switch
host = sw.ws.nsrc.org

#
# ********** Virtual Routers: Cisco 7200 images *********
#

+Routers

menu = Routers
title = Virtual and Physical Routers Network Management

++gw

menu = rtr
title = Gateway Router
host = rtr.ws.nsrc.org

++router1

menu = router1
title = Virtual Router 1
host = rtr1.ws.nsrc.org

++router2

menu = router2
title = Virtual Router 2
host = rtr2.ws.nsrc.org

++router3

menu = router3
title = Virtual Router 3
host = rtr3.ws.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


If you wish you can continue and add in entries for routers 4 to 6, or up to 9 if there are
that many in your class. When you are ready, save and exit from the Targets file and verify
your work:

    # service smokeping reload

If you want you might consider adding the Wireless Access Point:

    # editor Targets


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Classroom Wireless Access Point
#

++ap1

menu = ap1
title = Wireless Access Point 1
host = ap1.ws.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Save and exit from the file and reload the Smokeping service:

    # service smokeping reload


5. Add new probes to Smokeping
------------------------------

The current entry in the Probes file is fine, but if you wish to use additional
Smokeping checks you can add them in here and you can specify their default
behavior. You can do this, as well, in the Targets file if you wish.

To add a probe to check for HTTP latency as well as DNS lookup latency,
edit the Probes file and add the following text TO THE END of that file:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ EchoPingHttp

+ DNS
binary = /usr/bin/dig
pings = 5
step = 180
lookup = www.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


The DNS probe will look up the IP address of www.nsrc.org using whichever open
DNS servers (resolvers) you specify in the Targets file. You will see this a bit
further on in the exercises.
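
To get a feel for what the DNS probe measures, you can time a single lookup by hand with dig. This is only a rough sketch of the idea, not how the probe is implemented; the function names `query_time` and `dns_latency` are our own, and the resolver and name you pass in are up to you.

```shell
# Extract the "Query time" value (in milliseconds) from dig output on stdin.
query_time() {
    awk '/Query time:/ { print $4 }'
}

# Ask RESOLVER ($1) to look up NAME ($2) once and print how long it took.
# Needs dig and network access; gives up after one try of 2 seconds.
dns_latency() {
    dig @"$1" "$2" +tries=1 +time=2 | query_time
}
```

For example, `dns_latency 8.8.8.8 www.nsrc.org` prints the time in milliseconds that one query took.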

Now save and exit from the file and verify that your changes are working:

    # service smokeping reload



6. Add HTTP latency checks for the classroom PCs
------------------------------------------------

Edit the Targets file again and go to the end of the file:

    # editor Targets

At the end of the file add:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Local Web server response
#

+HTTP

menu = Local HTTP Response
title = HTTP Response Student PCs

++pc1

menu = pc1
title = pc1 HTTP response time
probe = EchoPingHttp
host = pc1.ws.nsrc.org

++pc2

menu = pc2
title = pc2 HTTP response time
probe = EchoPingHttp
host = pc2.ws.nsrc.org

++pc3

menu = pc3
title = pc3 HTTP response time
probe = EchoPingHttp
host = pc3.ws.nsrc.org

++pc4

menu = pc4
title = pc4 HTTP response time
probe = EchoPingHttp
host = pc4.ws.nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Rather than repeating "probe = EchoPingHttp" in every entry, you could set it once
in the "+HTTP" section itself, just above its "menu =" line. Entries below a section
inherit its probe unless they contain their own "probe = " statement.
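
One compact layout sets the probe once at the section level, the way "probe = DNS" is set under "+DNS" in exercise 7. The sketch below prints such a "+HTTP" section; the function name `http_targets` is hypothetical and the menu and title text are copied from the example above.

```shell
# Print a "+HTTP" section for pcFIRST..pcLAST ($1..$2) with the probe
# set once at the section level; the "++pcN" entries carry no probe line.
http_targets() {
    http_n="$1"
    printf '+HTTP\n\n'
    printf 'probe = EchoPingHttp\n'
    printf 'menu = Local HTTP Response\n'
    printf 'title = HTTP Response Student PCs\n\n'
    while [ "$http_n" -le "$2" ]; do
        printf '++pc%s\n\n' "$http_n"
        printf 'menu = pc%s\n' "$http_n"
        printf 'title = pc%s HTTP response time\n' "$http_n"
        printf 'host = pc%s.ws.nsrc.org\n\n' "$http_n"
        http_n=$((http_n + 1))
    done
}
```

`http_targets 1 4` prints the same four PCs as above with a single probe line.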

You can add more PC entries if you wish, or you could consider checking the
latency on remote machines - these are likely to be more interesting. Machines
such as your own publicly accessible servers are a good choice, or, perhaps other
web servers you use often (Google, Yahoo, Government pages, stores, etc.?).

For example, consider adding something like this at the bottom of the Targets file:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Remote Web server response
#

+HTTPRemote

menu = Remote HTTP Response
title = HTTP Response Remote Machines

++google

menu = Google
title = Google.com HTTP response time
probe = EchoPingHttp
host = www.google.com

++nsrc

menu = Network Startup Resource Center
title = nsrc.org HTTP response time
probe = EchoPingHttp
host = nsrc.org
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add your own hosts that you use at your organization to the list of Remote Web Servers.

Once you are done, save and exit from the Targets file and verify your work:

    # service smokeping reload




7. Add DNS latency checks
-------------------------

At the end of the Targets file we are going to add some entries to measure the
latency when remote recursive DNS servers are asked, from our location, to look up
an entry for nsrc.org. You would likely substitute an important address for your institution
in the Probes file instead. In addition, you can change the address you are looking
up inside the Targets file as well. For more information see:

<http://oss.oetiker.ch/smokeping/probe/DNS.en.html>

and

<http://oss.oetiker.ch/smokeping/probe/index.en.html>

Now edit the Targets file again. Be sure to go to the end of the file:

    # cd /etc/smokeping/config.d                        (just to be sure...)
    # editor Targets

At the end of the file add:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Sample DNS probe
#

+DNS

probe = DNS
menu = DNS Latency
title = DNS Latency Probes

++LocalDNS1
menu = 10.10.0.250
title = DNS Delay for local DNS Server on ns1.ws.nsrc.org
host = ns1.ws.nsrc.org

++GoogleA
menu = 8.8.8.8
title = DNS Latency for google-public-dns-a.google.com
host = google-public-dns-a.google.com

++GoogleB

menu = 8.8.4.4
title = DNS Latency for google-public-dns-b.google.com
host = google-public-dns-b.google.com

++OpenDNSA

menu = 208.67.222.222
title = DNS Latency for resolver1.opendns.com
host = resolver1.opendns.com

++OpenDNSB

menu = 208.67.220.220
title = DNS Latency for resolver2.opendns.com
host = resolver2.opendns.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Now save the Targets file and exit and verify your work:

    # service smokeping reload

Look at additional Smokeping probes and consider implementing some of
them if they are useful to your organization:

<http://oss.oetiker.ch/smokeping/probe/index.en.html>



8. MultiHost graphing
---------------------

Once you have defined a group of hosts under a single probe type in your
/etc/smokeping/config.d/Targets file, then you can create a single graph
that will show you the results of all smokeping tests for all hosts that
you define. This has the advantage of letting you quickly compare, for
example, a group of hosts that you are monitoring with the FPing probe.

The MultiHost graph function in Smokeping is extremely picky - pay close
attention!

To create a MultiHost graph first edit the file Targets:

    # editor Targets

We will create a MultiHost graph for the DNS Latency probes we just added.
To do this go to the end of the Targets file and add:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# Multihost Graph of all DNS latency checks
#

++MultiHostDNS

menu = MultiHost DNS
title = Consolidated DNS Responses
host = /DNS/LocalDNS1 /DNS/GoogleA /DNS/GoogleB /DNS/OpenDNSA /DNS/OpenDNSB
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
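
Since the "host =" line must list each entry by its full path in the menu hierarchy, one way to avoid typing mistakes is to build the line from the section name and the entry names. A minimal sketch, with a hypothetical function name `multihost_line`:

```shell
# Build a MultiHost "host =" line.
# $1 = section name without the "+" (for example DNS);
# remaining arguments = entry names defined under that section.
multihost_line() {
    mh_section="$1"
    shift
    mh_line="host ="
    for mh_name in "$@"; do
        # Each entry is referenced as /Section/Entry.
        mh_line="$mh_line /$mh_section/$mh_name"
    done
    printf '%s\n' "$mh_line"
}
```

`multihost_line DNS LocalDNS1 GoogleA GoogleB OpenDNSA OpenDNSB` prints the "host =" line used above.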

And, as always, save and exit from the file Targets and test your new configuration.


    # service smokeping reload


If this fails you almost certainly have an error in the entries. If you cannot figure
out what the error is (remember to try "tail /var/log/syslog" first!) ask your instructor
for some help.

You can add MultiHost graphs for any other set of probe tests (FPing, EchoPingHttp)
that you have configured. You must add the MultiHost entry at the end of a probe section.
If you don't understand how this works you can ask your instructors for help.

In addition, on the workshop NOC there are sample configuration files available, including
one for SmokePing that includes multiple MultiHost graph examples.


9. Send Smokeping alerts
------------------------

If you wish to receive an email when an alert condition is met on one of the
Smokeping checks, first do this:

    # cd /etc/smokeping/config.d
    # editor Alerts

Update the top of the file where it says:

    *** Alerts ***
    to = alertee@address.somewhere
    from = smokealert@company.xy

to include a proper "to" and "from" field for your server. Something like:

    *** Alerts ***
    to = sysadm@localhost
    from = smokeping-alert@localhost

Now you must update your device entries to include a line that reads:

    alerts = alertName1, alertName2, etc, etc...

For instance, the alert named "someloss" has already been defined in the file Alerts.

To read about Smokeping alerts and what they are detecting, how to create your own, etc. see:

<http://oss.oetiker.ch/smokeping/doc/smokeping_config.en.html>

and look for the section titled `*** Alerts ***` near the bottom of the page.
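
To see how an alert such as "someloss" is defined on your own machine, you can print just its section from the Alerts file. The sketch below assumes each alert starts on a line of the form "+name" and runs until the next line beginning with "+"; the function name `show_alert` is our own.

```shell
# Print the definition of one alert from a Smokeping Alerts file.
# $1 = alert name, $2 = file (defaults to the path used in this exercise).
show_alert() {
    awk -v name="$1" '
        # A line starting with "+" opens a new alert section.
        substr($0, 1, 1) == "+" { show = ($0 == "+" name) }
        # Print lines only while inside the requested section.
        show { print }
    ' "${2:-/etc/smokeping/config.d/Alerts}"
}
```

For example, `show_alert someloss` prints the "someloss" section, including the pattern that triggers it.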

To place some alert detection on some of your hosts open the file Targets:

    # editor Targets

and go near the start of the file where we defined our PCs. Just under the "host =" line add
another line that looks like this:

    alerts = someloss

So, for example, the pc1 entry would now look like this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
++pc1

menu = pc1
title = Virtual Machine 1
host = pc1.ws.nsrc.org
alerts = someloss
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
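
If you want the same alert on many hosts, adding the line by hand gets tedious. The sketch below prints a copy of a Targets file with an "alerts =" line after every "host =" line whose value does not start with "/" (so MultiHost entries are skipped). The function name `add_alerts` is hypothetical, it does not check whether a host already has an alerts line, and it writes to standard output rather than changing the file.

```shell
# Print FILE ($1) with "alerts = NAME" added under each real host line.
# $2 = alert name (defaults to someloss, the alert used in this exercise).
add_alerts() {
    awk -v alert="${2:-someloss}" '
        { print }
        # Match "host = value" unless the value starts with "/" (MultiHost).
        $0 ~ "^host[ \t]*=[ \t]*[^/ \t]" { print "alerts = " alert }
    ' "$1"
}
```

`add_alerts Targets > /tmp/Targets.new` writes the result to a new file that you can compare with the original (for example with `diff`) before copying it into place.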

If you want to add an alerts option to other hosts go ahead. Once you are done save and
exit from the Targets file and then verify that your configuration works:

    # service smokeping reload

If any of the hosts that have the "alerts = " option set meet the conditions to set off the
alert, then an email will arrive in the sysadm user's mailbox on the Smokeping server
machine (localhost). It's not likely that an alert will be set off for most machines. To
check you can read the email for the sysadm user by using an email client like "mutt":

    # apt-get install mutt
    # su - sysadm                               (changes you to the sysadm user from root)
    $ mutt

Say yes to mailbox creation when prompted, then see if you have email from the
smokeping-alert@localhost user. You probably will not. To exit from Mutt press "q".

To leave the sysadm user shell type:

    $ exit
    #


10. Slave instances - Informational Only
----------------------------------------

This is a description only for informational purposes in case you wish
to attempt this type of configuration once the workshop is over.

The idea behind this is that you can run multiple smokeping instances
at multiple locations that are monitoring the same hosts and/or services
as your master instance. The slaves will send their results to the
master server and you will see these results side-by-side with your
local results. This allows you to view how users outside your network
see your services and hosts.

This can be a powerful tool for resolving service and host issues that
may be difficult to troubleshoot if you only have local data.

Graphically this looks like this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

          [slave 1]     [slave 2]      [slave 3]
                |             |              |
                +-------+     |     +--------+
                        |     |     |
                        v     v     v
                        +---------------+
                        |    master     |
                        +---------------+

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can see an example of this data here:

<http://oss.oetiker.ch/smokeping-demo/>

Look at the various graph groups and notice that many of the graphs
have multiple lines with the color code chart listing items such as
"median RTT from mipsrv01". These are not MultiHost graphs, but rather
graphs with data from external smokeping servers.

To configure a smokeping master/slave server you can see the documentation
here:

<http://oss.oetiker.ch/smokeping/doc/smokeping_master_slave.en.html>

In addition, a sample set of steps for configuring this is available in
the file sample-smokeping-master-slave.txt which should be listed as an
additional reference at the bottom of the Agenda page on your classroom wiki.
