File exercises-smokeping.txt, 12.1 KB (added by regnauld, 8 years ago)

Network Management & Monitoring

Smokeping

Notes:
------
* Commands preceded with "$" imply that you should execute the command as
  a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Exercises
---------

0. Log in to your PC or open a terminal window as the sysadmin user.

Once you are logged in you can continue with these exercises.

1. Install Smokeping

    $ sudo apt-get install smokeping

        (Probably already installed on your machines)


2. Initial Configuration

    $ cd /etc/smokeping/config.d
    $ ls -l

    -rwxr-xr-x 1 root root  578 2010-02-26 01:55 Alerts
    -rwxr-xr-x 1 root root  237 2010-02-26 01:55 Database
    -rwxr-xr-x 1 root root  413 2010-02-26 05:40 General
    -rwxr-xr-x 1 root root  271 2010-02-26 01:55 pathnames
    -rwxr-xr-x 1 root root  859 2010-02-26 01:55 Presentation
    -rwxr-xr-x 1 root root  116 2010-02-26 01:55 Probes
    -rwxr-xr-x 1 root root  155 2010-02-26 01:55 Slaves
    -rwxr-xr-x 1 root root 8990 2010-02-26 06:30 Targets

    The files you need to touch (at a minimum) are:

    * Alerts
    * General
    * Probes
    * Targets

    Edit Alerts

    $ sudo vi Alerts

    Update the top of the file where it says:

        *** Alerts ***
        to = alertee@address.somewhere
        from = smokealert@company.xy

    to include a proper "to" and "from" field for your server.
    Something like:

        *** Alerts ***
        to = sysadmin@localhost
        from = smokeping-alert@localhost

    If you were going to create tickets from Smokeping alerts, the "to"
    address would be an alias for the ticketing system, for example
    "net@localhost". We will do this in exercise 5.

    Add a new alert for later use:

        +rttbadstart
        type = rtt
        # in milliseconds
        pattern = ==S,==U
        priority = 1
        comment = offline at startup

    * "==S,==U" matches the Startup marker followed by an Unknown value
      (no reply received), that is, a host that is unreachable when
      Smokeping starts.
    * "priority = 1" means that if multiple alerts are defined for a host
      and more than one of them matches, only the one with the highest
      priority (the lowest number) is executed.

    Now save the file and exit, then edit the file General:

    $ sudo vi General

    Change the following lines:

        owner
        contact
        cgiurl
        mailhost

    Something like this should work:

        owner    = NOC
        contact  = sysadmin@localhost
        cgiurl   = http://localhost/cgi-bin/smokeping.cgi
        mailhost = localhost

    Now save the file and exit, then edit the file Probes:

    $ sudo vi Probes

    The current entry in Probes is fine, but if you wish to use
    additional Smokeping checks you can add them here and specify
    their default behavior. You can also do this in the Targets
    file if you wish.

    Here is an example of a Probes file that specifies what to use
    to check HTTP and DNS latency, as well as the FPing probe that
    is used for ping latency:

        *** Probes ***

        + FPing

        binary = /usr/bin/fping

        + EchoPingHttp

        + DNS
        binary = /usr/bin/dig
        pings = 5
        step = 180
        lookup = www.nsrc.org
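
    Probes that name a "binary" depend on that external program, and a
    wrong path is an easy mistake to make. A minimal shell sketch for
    checking the paths before restarting follows. The function name
    check_binaries is made up for illustration, and the paths are the
    ones from the example Probes file; adjust them if your system
    installs fping or dig elsewhere.

```shell
# check_binaries is a hypothetical helper, not part of Smokeping.
# It reports whether each given path exists and is executable.
check_binaries() {
    missing=0
    for b in "$@"; do
        if [ -x "$b" ]; then
            echo "ok: $b"
        else
            echo "missing: $b"
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the example Probes file.
check_binaries /usr/bin/fping /usr/bin/dig ||
    echo "Install the missing programs before restarting Smokeping."
```

    The function returns a non-zero status if any path is missing, so
    it can be combined with "||" as in the last two lines.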

    Update your Probes file to match the example Probes file above,
    then save the file and exit. Now restart the Smokeping service to
    verify that no mistakes have been made before going any further:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start

    You could also do:

    $ sudo /etc/init.d/smokeping restart

    or

    $ sudo /etc/init.d/smokeping reload

    to reload configuration changes.

    NB! Due to potential problems in the smokeping init script we recommend
    using:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start

    ... instead of the "restart" option.
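
    A stop followed by a start with a broken configuration can leave
    Smokeping not running at all. One way to reduce that risk is to
    have Smokeping parse the configuration first. The sketch below
    assumes that the smokeping program is on root's PATH and reads the
    default configuration file; "--check" is the smokeping option that
    only checks the configuration syntax without starting the daemon.
    The function name safe_restart is made up for illustration.

```shell
# safe_restart is a hypothetical helper, not part of Smokeping.
# It restarts the service only if the configuration parses cleanly.
safe_restart() {
    # --check only parses the configuration; it does not start the daemon.
    if sudo smokeping --check; then
        sudo /etc/init.d/smokeping stop
        sudo /etc/init.d/smokeping start
    else
        echo "Configuration check failed; not restarting Smokeping." >&2
        return 1
    fi
}
```

    After defining the function in your shell, run it with:

    $ safe_restart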


3. Configure monitoring of devices

    The majority of your time and work configuring Smokeping
    will be done in the file /etc/smokeping/config.d/Targets.

    For this class please do the following:

    Use the FPing probe to check:

      - all the student NOC PCs
      - classroom NOC
      - switches
      - routers

    You can use the classroom Network Diagram on the classroom wiki
    (http://noc/) to figure out the address of each item.

    Create a hierarchy in the Smokeping menu for your checks,
    such as:

        PCs
        Routers
        Switches

    Add a check for HTTP latency for all the classroom PCs.
    This will mean adding another category, such as:

        HTTP Servers
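
    Typing one stanza per host gets repetitive. As a rough sketch, a
    small shell function can print the stanzas for one category, which
    you can then review and paste into Targets below the
    "*** Targets ***" header. The function name gen_targets, the section
    names PCs and HTTP, and the host names pc1 to pc3 are assumptions
    for illustration; use the names from the classroom network diagram.
    Smokeping section names are limited to letters, digits, "-" and "_",
    so a host name containing dots cannot be used directly as a section
    name.

```shell
# gen_targets is a hypothetical helper, not part of Smokeping.
# Usage: gen_targets SECTION MENU PROBE HOST...
# Prints one top-level section plus one sub-section per host.
gen_targets() {
    section=$1
    menu=$2
    probe=$3
    shift 3
    printf '+ %s\nmenu = %s\ntitle = %s\nprobe = %s\n\n' \
        "$section" "$menu" "$menu" "$probe"
    for h in "$@"; do
        # The host name doubles as the section name, menu entry and title.
        printf '++ %s\nmenu = %s\ntitle = %s\nhost = %s\n\n' \
            "$h" "$h" "$h" "$h"
    done
}

# Example host names only; replace them with your own.
gen_targets PCs "Classroom PCs" FPing pc1 pc2 pc3
gen_targets HTTP "HTTP Servers" EchoPingHttp pc1 pc2 pc3
```

    Setting "probe" on the top-level section means every host below it
    uses that probe, so the same hosts can appear once under PCs for
    ping latency and once under HTTP for web latency.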

    If you have time, consider checking some machines that are
    external to our classroom and the conference (your organization's
    website, a popular web page, etc.).

    Look at additional Smokeping probes and consider implementing
    some of them:

    http://oss.oetiker.ch/smokeping/probe/index.en.html

    Explaining all the syntactic details of the file
    /etc/smokeping/config.d/Targets would take several pages, so we
    will go through some examples in class. You can also refer to the
    Smokeping configuration files that are in use on the classroom
    NOC box by going to:

    http://noc/configs/etc/smokeping
    http://noc/configs/etc/smokeping/config.d

    Review these files and try to do all the suggested steps from above.

4. Add DNS Latency Checks

    You can check internal names, external names, or both using
    the DNS latency probe.

    Add a menu hierarchy for DNS Latency. Check an external address
    (nsrc.org) and an internal address (noc). This will look something
    like this (in Targets):

    $ sudo vi /etc/smokeping/config.d/Targets

        ++ DNS
        probe = DNS
        menu = External DNS Check
        title = DNS Latency

        +++ nsrc
        host = nsrc.org

        +++ noc
        host = noc.mgmt

    Save your changes to the file Targets and exit.

    Restart Smokeping to see the changes:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start
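
    To sanity-check what the DNS graphs show, you can time a lookup by
    hand with dig. As far as we understand the DNS probe, it asks each
    target host (used as the name server) to resolve the "lookup" name
    from the Probes file, so the manual equivalent would be along the
    lines of "dig @noc.mgmt www.nsrc.org". The sketch below pulls the
    "Query time" figure out of dig's statistics output. The function
    name query_time and the sample line fed to it are made up for
    illustration.

```shell
# query_time is a hypothetical helper: it reads dig output on stdin
# and prints the number from the ";; Query time: N msec" line.
query_time() {
    awk '/Query time:/ { print $4 }'
}

# Made-up sample line in the format dig prints with +stats.
printf ';; Query time: 23 msec\n' | query_time

# Against a real server (needs the classroom network):
#   dig +noall +stats @noc.mgmt www.nsrc.org | query_time
```

    The value printed by the real command should be in the same range
    as the latency Smokeping graphs for that target.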


5. Send Smokeping alerts to our Request Tracker Net queue

    We've already set this up in RT and in /etc/aliases. You just
    need to point Smokeping alerts to our RT instance. Edit the file
    Alerts:

    $ sudo vi /etc/smokeping/config.d/Alerts

    And change:

        to = sysadmin@localhost

    to

        to = net@localhost

    Now, whenever Smokeping sends an alert, the email will arrive in
    the Net queue in Request Tracker.

    Next, be sure you have alerts defined for some of your Targets.
    You can turn on alerts either by defining alerts for a probe in
    the /etc/smokeping/config.d/Probes file, or in individual Targets
    entries.

    In our case let's edit the Targets file and turn on alerts for our
    DNS Latency checks. In addition, if you add a DNS latency check for
    a host that does not exist, then we can see a ticket being created
    in RT.

    $ sudo vi /etc/smokeping/config.d/Targets

    Find the following section in the file:

        ++ DNS
        probe = DNS
        menu = External DNS Check
        title = DNS Latency

        +++ nsrc
        host = nsrc.org

        +++ noc
        host = noc.mgmt

    And add the following host after "+++ noc":

        +++ noexist
        host = does.not.exist
        alerts = rttbadstart

    Save and exit from the file, then restart smokeping:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start

    You will see a warning message on the screen:

        WARNING: Hostname 'does.not.exist' does currently not resolve to
        an IPv6 or IPv4 address

    This is to be expected, as the host "does.not.exist" is not a valid
    host. Smokeping still starts, however, and the rttbadstart alert will
    now send email to the Net queue for Request Tracker. If you open a
    web browser to your RT instance:

    http://MyMachine/rt/

    and log in as "sysadmin" you will see a new ticket on the home screen
    with a subject similar to:

    "[SmokeAlert] rttbadstart is active on
    AROC.DNSProbe.RT-test"

    (The name after "on" reflects where the target sits in the Targets
    hierarchy, so yours will differ.)


6. MultiHost Graphs

    Once you have defined a group of hosts under a single probe type in your
    /etc/smokeping/config.d/Targets file, you can create a single graph
    that shows the results of all smokeping tests for all the hosts that
    you define. This has the advantage of letting you quickly compare, for
    example, a group of hosts that you are monitoring with the FPing probe.

    The MultiHost graph functionality in Smokeping is extremely picky - pay
    close attention.

    To create a MultiHost graph first edit the file Targets:

    $ sudo vi /etc/smokeping/config.d/Targets

    Suppose you have a section for the FPing probe that looks like this
    (this is an example only - your Targets file may look different):

        + Local
        menu = Local
        title = Local Network

        ++ LocalMachine
        menu = Local Machine
        title = This host
        host = localhost

        ++ pc1
        menu = pc1
        title = pc1
        host = pc1

        ++ pc2
        menu = pc2
        title = pc2
        host = pc2

        ++ pc3
        menu = pc3
        title = pc3
        host = pc3

    Right now smokeping displays the results of the FPing probe for each
    host in a separate graph. If you wish to see the results in a
    single graph with multiple lines, add this after the last FPing
    probe host definition:

        + MultiHostPCs
        menu = MultiHost Ping
        title = Consolidated Ping Response Time
        host = /Local/LocalMachine /Local/pc1 /Local/pc2 /Local/pc3

    (Note: if the line gets too long, you can split the "host" entry
    across multiple lines by ending each line with the "\" character -
    ask about this if you are unsure!)
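
    Each entry on the MultiHost "host" line is the full path of a target
    through the section hierarchy, starting with "/", and a mistyped
    path is the most likely way to get this wrong. A small sketch that
    builds the line from a parent section and its children follows; the
    function name multihost_line is made up for illustration.

```shell
# multihost_line is a hypothetical helper, not part of Smokeping.
# Usage: multihost_line PARENT NAME...
# Prints a "host =" line listing /PARENT/NAME for every NAME given.
multihost_line() {
    parent=$1
    shift
    line="host ="
    for name in "$@"; do
        line="$line /$parent/$name"
    done
    printf '%s\n' "$line"
}

# Section names from the example Targets section above.
multihost_line Local LocalMachine pc1 pc2 pc3
```

    The names passed in are the section names (the text after "++"),
    not the "menu" or "host" values, which is why the first entry is
    LocalMachine rather than localhost.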

    Now save and exit the file Targets and restart smokeping:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start

    You should see a new graph under the "MultiHost Ping" menu in your
    smokeping web interface. This graph will have a different colored
    line for each host you have defined.


7. Slave instances - only done if we have the time.

    This is a description only for informational purposes in case you wish
    to attempt this type of configuration once the workshop is over.

    The idea behind this is that you can run multiple smokeping instances
    at multiple locations that are monitoring the same hosts and/or services
    as your master instance. The slaves will send their results to the
    master server and you will see these results side-by-side with your
    local results. This allows you to view how users outside your network
    see your services and hosts.

    This can be a powerful tool for resolving service and host issues that
    may be difficult to troubleshoot if you only have local data.

    Graphically this looks like this:

          [slave 1]     [slave 2]      [slave 3]
                |             |              |
                +-------+     |     +--------+
                        |     |     |
                        v     v     v
                        +---------------+
                        |    master     |
                        +---------------+

    You can see an example of this data here:

    http://oss.oetiker.ch/smokeping-demo/

    Look at the various graph groups and notice that many of the graphs
    have multiple lines, with the color code chart listing items such as
    "median RTT from mipsrv01". These are not MultiHost graphs, but rather
    graphs with data from external smokeping servers.

    To configure a smokeping master/slave server you can see the documentation
    here:

    http://oss.oetiker.ch/smokeping/doc/smokeping_master_slave.en.html

    In addition, a sample set of steps for configuring this is available in
    the file sample-smokeping-master-slave.txt.
