Network Management & Monitoring
Smokeping
Notes:
------
* Commands preceded with "$" imply that you should execute the command as
a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
imply that you are executing commands on remote equipment, or within
another program.
Exercises
----------
0. Log in to your PC or open a terminal window as the sysadm user.
Once you are logged in you can continue with these exercises.
1. Install Smokeping
$ sudo apt-get install smokeping
2. Initial Configuration
$ cd /etc/smokeping/config.d
$ ls -l
-rwxr-xr-x 1 root root 578 2010-02-26 01:55 Alerts
-rwxr-xr-x 1 root root 237 2010-02-26 01:55 Database
-rwxr-xr-x 1 root root 413 2010-02-26 05:40 General
-rwxr-xr-x 1 root root 271 2010-02-26 01:55 pathnames
-rwxr-xr-x 1 root root 859 2010-02-26 01:55 Presentation
-rwxr-xr-x 1 root root 116 2010-02-26 01:55 Probes
-rwxr-xr-x 1 root root 155 2010-02-26 01:55 Slaves
-rwxr-xr-x 1 root root 8990 2010-02-26 06:30 Targets
$ sudo vi General
Change the following lines:
owner = NOC
contact = sysadm@localhost
cgiurl = http://localhost/cgi-bin/smokeping.cgi
mailhost = localhost
Save the file and exit. Now let's restart the
Smokeping service to verify that no mistakes have been made
before going any further:
$ sudo service smokeping restart
2. Configure monitoring of devices
The majority of your time and work configuring Smokeping
will be done in the file /etc/smokeping/config.d/Targets.
For this class please do the following:
Use the default FPing probe to check:
- all the student PCs
- classroom NOC
- switches
- routers
You can use the classroom Network Diagram on the classroom wiki to
figure out addresses for each item, etc.
Please note - a complete set of configuration files is available on
your classroom wiki by going to the "Configuration Files" link on the
main page of the wiki (http://noc.ws.nsrc.org/wiki/).
Create some hierarchy in the Smokeping menu for your
checks, such as:
+ PCs
++ pc1
menu = pc1
title = pc1
host = pc1
++ pc2
menu = pc2
title = pc2
host = pc2
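Typing one stanza per PC by hand gets tedious in a large classroom. The
following is a minimal sketch, assuming your hosts are named pc1, pc2, ...
pcN as in the example above. The function name gen_pc_targets is
hypothetical and not part of Smokeping; it only prints text that you can
review and paste into Targets.

```shell
#!/bin/sh
# gen_pc_targets N
# Print Smokeping target stanzas for pc1..pcN under a "+ PCs" section.
# Hypothetical helper for illustration; it does not touch any files.
gen_pc_targets() {
    count=$1
    printf '+ PCs\n'
    i=1
    while [ "$i" -le "$count" ]; do
        printf '++ pc%s\nmenu = pc%s\ntitle = pc%s\nhost = pc%s\n' \
            "$i" "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# Example: print stanzas for three PCs.
gen_pc_targets 3
```

Adjust the count to match the number of PCs in your classroom, and check
the output before adding it to the file.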
Save the file and restart Smokeping:
$ sudo service smokeping restart
Go to your browser and check the Smokeping page:
http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi
If everything is looking OK, continue adding:
+ Routers
++ gw
menu = gw
title = gw
host = gw
++ r1
menu = r1
title = r1
host = r1
+ ClassroomSwitch
++ sw
menu = sw
title = sw
host = sw
...
Save the file, restart smokeping, and check
your browser again.
3. Add new probes
The current entry in Probes is fine, but if you wish to
use additional Smokeping checks you can add them in here
and you can specify their default behavior. You can do
this, as well, in the Targets file if you wish.
Here is an example of a Probes file that would specify
what to use to check for HTTP and DNS latency as well as
the FPing probe that is used for ping latency:
$ sudo vi Probes
*** Probes ***
+ FPing
binary = /usr/bin/fping
+ EchoPingHttp
+ DNS
binary = /usr/bin/dig
pings = 5
step = 180
lookup = www.nsrc.org
Save the file.
4. Add HTTP latency checks
Now edit your Targets again:
$ sudo vi Targets
Add a check for HTTP latency for all the classroom PCs.
This will mean adding another category, such as:
+ HTTPServers
probe = EchoPingHttp
++ PC1
host = pc1
++ PC2
host = pc2
...
If you have time, consider checking some machines that are
external to our classroom and the conference (your organization's
website, a popular web page, etc.).
5. Add DNS Latency Checks
You can check internal names, external names, or both using
the DNS latency probe.
Add a menu hierarchy for DNS Latency. Check an external address
(nsrc.org) and an internal address (noc). This will look something
like this (in Targets):
+ DNS
probe = DNS
menu = External DNS Check
title = DNS Latency
++ nsrc
host = nsrc.org
++ noc
host = noc.mgmt
Save your changes to the file Targets and exit.
Restart Smokeping to see the changes:
$ sudo service smokeping restart
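If the DNS graphs look surprising, it can help to spot-check a lookup by
hand. The sketch below pulls the "Query time" value out of dig output. It
is only a manual sanity check; the Smokeping DNS probe may measure latency
differently. The function name query_time_ms is hypothetical, and the
sample line is canned so the block runs without network access.

```shell
#!/bin/sh
# query_time_ms
# Read dig output on stdin and print the reported query time in msec.
# Hypothetical helper for illustration only.
query_time_ms() {
    awk '/Query time:/ { print $4 }'
}

# Canned sample in the format dig prints. For a live check, run e.g.:
#   dig www.nsrc.org | query_time_ms
printf ';; Query time: 23 msec\n' | query_time_ms
```

Compare a few manual readings against the values Smokeping graphs for the
same name.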
Look at additional Smokeping probes and consider implementing
some of them:
http://oss.oetiker.ch/smokeping/probe/index.en.html
Explaining every syntactical detail of the file
/etc/smokeping/config.d/Targets would require several pages, so we
will go through some examples in class. You can also refer to the
Smokeping configuration files in use on the classroom NOC box:
http://noc/configs/etc/smokeping
http://noc/configs/etc/smokeping/config.d
6. Send Smokeping alerts (SKIP THIS EXERCISE FOR TRACK 2)
$ sudo vi Alerts
Update the top of the file where it says:
*** Alerts ***
to = alertee@address.somewhere
from = smokealert@company.xy
to include a proper "to" and "from" field for your server.
Something like:
*** Alerts ***
to = sysadm@localhost
from = smokeping-alert@localhost
If you have installed RT, you can instead send your alerts
to an existing RT queue:
*** Alerts ***
to = net@localhost
At the end of the file, add another alert like this:
+anydelay
type = rtt
# in milliseconds
pattern = >1
comment = Just for testing
Notice the pattern in this alert. It means that an alert will be triggered
as soon as a sample measurement has "ANY" delay, that is, more than one
millisecond. This is just for testing. In reality, you will want to create
an alert based on your observed baseline, for example one that triggers if
your DNS servers' delay suddenly goes from under 10 ms to over 100 ms.
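To make the idea of a pattern concrete, here is a simplified sketch of how
a comma-separated pattern is compared against a series of samples. It is
not Smokeping's own code and supports only the ">N" and "<N" comparators;
real patterns offer more. The function name matches_pattern and the sample
values are made up for illustration.

```shell
#!/bin/sh
# matches_pattern PATTERN SAMPLE...
# Return 0 when every sample (in ms, oldest first) satisfies the
# comparator at the same position in PATTERN, otherwise return 1.
# Simplified illustration: only ">N" and "<N" are understood.
matches_pattern() {
    pattern=$1
    shift
    printf '%s\n' "$pattern" "$*" | awk '
        NR == 1 { n = split($0, pat, ",") }
        NR == 2 {
            m = split($0, val, " ")
            if (m != n) exit 1
            for (i = 1; i <= n; i++) {
                op = substr(pat[i], 1, 1)
                limit = substr(pat[i], 2) + 0
                v = val[i] + 0
                if (op != ">" && op != "<") exit 1
                if (op == ">" && !(v > limit)) exit 1
                if (op == "<" && !(v < limit)) exit 1
            }
            exit 0
        }'
}

# Baseline under 10 ms, then three samples over 100 ms.
if matches_pattern '<10,<10,<10,>100,>100,>100' 4 5 6 120 130 150; then
    echo "alert would fire"
else
    echo "no alert"
fi
```

With the test pattern ">1", a single sample above one millisecond is
enough to match, which is why the anydelay alert fires so easily.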
Next, be sure you have this test alert defined for some of your Targets.
You can either turn on alerts by defining alerts for a probe in
the /etc/smokeping/config.d/Probes file, or by individual Targets
entries.
In our case let's edit the Targets file and turn on alerts for our
DNS Latency checks.
$ sudo vi /etc/smokeping/config.d/Targets
Find the following section in the file:
+ DNS
probe = DNS
menu = External DNS Check
title = DNS Latency
++ nsrc
host = nsrc.org
And add the following alerts line after the "host" line under "++ nsrc":
++ nsrc
host = nsrc.org
alerts = anydelay
Save and exit from the file, then restart smokeping:
$ sudo service smokeping restart
Check your e-mail with mutt
$ mutt
(or check your RT queues)
After about 5 minutes, see whether you have received any alerts.
6. MultiHost Graphs
Once you have defined a group of hosts under a single probe type in your
/etc/smokeping/config.d/Targets file, then you can create a single graph
that will show you the results of all smokeping tests for all hosts that
you define. This has the advantage of letting you quickly compare, for
example, a group of hosts that you are monitoring with the FPing probe.
The MultiHost graph function in Smokeping is extremely picky - pay close
attention.
To create a MultiHost graph first edit the file Targets:
$ sudo vi Targets
If you had a section for the FPing probe defined that looked like this
(this is an example only - your Targets file may look different):
+ Local
menu = Local
title = Local Network
++ LocalMachine
menu = Local Machine
title = This host
host = localhost
++ pc1
menu = pc1
title = pc1
host = pc1
++ pc2
menu = pc2
title = pc2
host = pc2
++ pc3
menu = pc3
title = pc3
host = pc3
Right now smokeping displays the results of the FPing probe for each
host defined in separate graphs. If you wish to see the results in a
single graph with multiple lines, then you would do this after the last
FPing probe host definition:
+ MultiHostPCs
menu = MultiHost Ping
title = Consolidated Ping Response Time
host = /Local/LocalMachine /Local/pc1 /Local/pc2 /Local/pc3
(Note: if the lines get too long, you can have multiple lines for the
"host" entry by using the "\" character to indicate another line - ask about
this if you are unsure!)
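Because the MultiHost "host" line must list the full path of every target,
it is easy to mistype. The sketch below builds the line from a section name
and a list of target names, assuming all targets sit directly under one
top-level section as in the example above. The function name
multihost_line is hypothetical and not part of Smokeping.

```shell
#!/bin/sh
# multihost_line SECTION NAME...
# Print a "host =" line listing /SECTION/NAME paths separated by spaces.
# Hypothetical helper for illustration; it only prints text.
multihost_line() {
    section=$1
    shift
    line='host ='
    for name in "$@"; do
        line="$line /$section/$name"
    done
    printf '%s\n' "$line"
}

# Example matching the Local section shown above.
multihost_line Local LocalMachine pc1 pc2 pc3
```

The names you pass must match the "+" and "++" section names in Targets
exactly, including case.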
Now save and exit the file Targets and restart smokeping:
$ sudo service smokeping restart
You should see a new graph under the "MultiHost Ping" menu in your
smokeping web interface. This graph will have different color lines
for each host you have defined.
7. Slave instances - only done if we have the time.
This is a description only for informational purposes in case you wish
to attempt this type of configuration once the workshop is over.
The idea behind this is that you can run multiple smokeping instances
at multiple locations that are monitoring the same hosts and/or services
as your master instance. The slaves will send their results to the
master server and you will see these results side-by-side with your
local results. This allows you to view how users outside your network
see your services and hosts.
This can be a powerful tool for resolving service and host issues that
may be difficult to troubleshoot if you only have local data.
Graphically this looks like this:
[slave 1]   [slave 2]   [slave 3]
    |           |           |
    +-------+   |   +-------+
            |   |   |
            v   v   v
        +---------------+
        |    master     |
        +---------------+
You can see an example of this data here:
http://oss.oetiker.ch/smokeping-demo/
Look at the various graph groups and notice that many of the graphs
have multiple lines with the color code chart listing items such as
"median RTT from mipsrv01" - These are not MultiHost graphs, but rather
graphs with data from external smokeping servers.
To configure a smokeping master/slave server you can see the documentation
here:
http://oss.oetiker.ch/smokeping/doc/smokeping_master_slave.en.html
In addition, a sample set of steps for configuring this is available in
the file sample-smokeping-master-slave.txt which is available under the
attachments listing on the agenda page of the class wiki.