Network Management & Monitoring

Smokeping Exercises
-------------------

0. Connect to your PC as the sysadm user and start a root shell

    $ sudo bash
    #

1. Install Smokeping
--------------------

    # apt-get install smokeping

Then point your web browser at

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi

to check that it is running.

2. Initial Configuration
------------------------

    # cd /etc/smokeping/config.d
    # ls -l

    -rwxr-xr-x 1 root root  578 2010-02-26 01:55 Alerts
    -rwxr-xr-x 1 root root  237 2010-02-26 01:55 Database
    -rwxr-xr-x 1 root root  413 2010-02-26 05:40 General
    -rwxr-xr-x 1 root root  271 2010-02-26 01:55 pathnames
    -rwxr-xr-x 1 root root  859 2010-02-26 01:55 Presentation
    -rwxr-xr-x 1 root root  116 2010-02-26 01:55 Probes
    -rwxr-xr-x 1 root root  155 2010-02-26 01:55 Slaves
    -rwxr-xr-x 1 root root 8990 2010-02-26 06:30 Targets

The files that you'll need to change, at a minimum, are:

  * Alerts
  * General
  * Probes
  * Targets

Now open the General file (note the first capital letter)

    # editor General

Change the following lines (don't leave them indented):

    owner = NOC
    contact = sysadmin@localhost
    cgiurl = http://localhost/cgi-bin/smokeping.cgi
    mailhost = localhost
    # specify this to get syslog logging
    syslogfacility = local5

Save the file and exit. Now let's restart the Smokeping service to verify
that no mistakes have been made before going any further:

    # service smokeping stop
    # service smokeping start

Warning! The "restart" option is not reliable. Use "stop" and "start" instead.

3. Configure monitoring of devices
----------------------------------

The majority of your time and work configuring Smokeping will be done in
the file /etc/smokeping/config.d/Targets.

For this class please do the following:

Use the default FPing probe to check:

  - all the student NOC PCs
  - classroom NOC
  - switches
  - routers

You can use the classroom Network Diagram on the classroom wiki to figure
out addresses for each item, etc.

Create some hierarchy to the Smokeping menu for your checks.
For example, the Targets file is already partially preconfigured. To start
we are going to add some entries to this file. Start with:

    # cd /etc/smokeping/config.d
    # editor Targets

You can take the section from *** Targets *** to the end of the
LocalMachine entry and make it look something like this. Feel free to use
your own "remark", "menu" text and titles.

The ">>>>>>>>" and "<<<<<<<<" lines are not part of the file. They mark
the start and the end of the text:

>>>>>>>>
*** Targets ***

probe = FPing

## You have to edit and uncomment all what you want below this.
# Please, refer to smokeping_config man page for more info
# The given adresses aren't real to avoid DoS.

menu = Top
title = Network Latency Grapher
remark = Welcome to the SmokePing Latency Grapher for \
         the GARNET-AfNOG-KNUST-NSRC Workshop

+Local

menu = Network Monitoring and Management
title = NOC Server for Network Monitoring Class

++LocalMachine

menu = localhost
title = localhost
host = localhost
<<<<<<<<

Now, below the "localhost" entry we start with the configuration of items
for our class. We can start simple and add just the first 4 PCs that are in
Group 1 as well as an entry for our classroom NOC machine and our three
Mac Mini server boxes.

Warning! If you do not have properly functioning DNS resolution, then you
will need to use the complete Fully Qualified Domain Name (FQDN) for each
machine you are monitoring.
Thus, instead of "host = pc1" you will need to specify
"host = pc1.ws.nsrc.org"

>>>>>>>>
#
# ********* Classroom Servers **********
#

+Servers

menu = Servers
title = Network Management Servers

++noc

menu = noc
title = Workshop NOC
host = noc

++s1

menu = s1
title = s1 (Host MacMini for Student PCs)
host = s1

++s2

menu = s2
title = s2 (Host MacMini for Student PCs)
host = s2

++s3

menu = s3
title = s3 (Host MacMini for Student PCs)
host = s3

#
# ******** Student Machines (VMs) ***********
#

+PCs

menu = Lab PCs
title = Virtual PCs Network Management

++pc1

menu = pc1
title = Virtual Machine 1
host = pc1

++pc2

menu = pc2
title = Virtual Machine 2
host = pc2

++pc3

menu = pc3
title = Virtual Machine 3
host = pc3

++pc4

menu = pc4
title = Virtual Machine 4
host = pc4
<<<<<<<<

OK. Let's see if we can get Smokeping to stop and start with the changes we
have made so far. Save and exit from the Targets file. Now try doing:

    # service smokeping stop
    # service smokeping start

If you see error messages, then read them closely and try to correct the
problem in the Targets file. In addition, Smokeping is now sending log
messages to the file /var/log/messages. You can view what Smokeping is
saying by typing:

    # tail /var/log/messages

If you want to see all Smokeping-related messages in the file
/var/log/messages you can do this:

    # grep smokeping /var/log/messages

If there are no errors you can view the results of your changes by going to:

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi

When you are ready you can edit the Targets file again and continue to add
machines.
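The PC entries all follow one pattern, so they can also be generated
rather than typed. The function below is a minimal sketch of that idea.
The name gen_pc_targets is hypothetical (it is not part of Smokeping), and
it assumes your machines are named pcN with titles of the form "Virtual
Machine N", as in the entries above.

```shell
# Hypothetical helper (not part of Smokeping): print Targets stanzas for
# pcSTART..pcEND in the same form as the hand-written pc1 to pc4 entries.
gen_pc_targets() {
    start=$1
    end=$2
    n=$start
    while [ "$n" -le "$end" ]; do
        # One stanza per PC: section name, menu, title, host, blank line.
        printf '++pc%s\nmenu = pc%s\ntitle = Virtual Machine %s\nhost = pc%s\n\n' \
            "$n" "$n" "$n" "$n"
        n=$((n + 1))
    done
}

# Example: print the stanzas for pc5 to pc8 so they can be reviewed.
gen_pc_targets 5 8
```

Review the output before pasting it into the Targets file. The stanzas
belong in the +PCs section, so appending them to the end of the file is
only appropriate while +PCs is still the last section.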
At the bottom of the file you can add the next group of PCs:

>>>>>>>>
++pc5

menu = pc5
title = Virtual Machine 5
host = pc5

++pc6

menu = pc6
title = Virtual Machine 6
host = pc6

++pc7

menu = pc7
title = Virtual Machine 7
host = pc7

++pc8

menu = pc8
title = Virtual Machine 8
host = pc8
<<<<<<<<

Add as many PCs as you want, then save and exit from the Targets file and
verify that the changes you have made are working:

    # service smokeping stop
    # service smokeping start

You can continue to view the updated results of your changes on the
Smokeping web page. It may take up to 5 minutes before graphs begin to
appear.

    http://pcN.ws.nsrc.org/cgi-bin/smokeping.cgi

4. Configure monitoring of routers and switches
-----------------------------------------------

Once you have configured as many PCs as you want to configure, then it's
time to add in some entries for the classroom routers and switch(es).

    # cd /etc/smokeping/config.d     (just to be sure :-))
    # editor Targets

Go to the bottom of the file and add in some entries for routers and
switches:

>>>>>>>>
#
# ********** Classroom Backbone Switch *********
#

+Switches

menu = Switches
title = Switches Network Management

++sw

menu = sw
title = Backbone Switch
host = sw

#
# ********** Virtual Routers: Cisco 7200 images *********
#

+Routers

menu = Routers
title = Virtual and Physical Routers Network Management

++gw

menu = rtr
title = Gateway Router
host = rtr

++router1

menu = router1
title = Virtual Router 1
host = rtr1

++router2

menu = router2
title = Virtual Router 2
host = rtr2

++router3

menu = router3
title = Virtual Router 3
host = rtr3
<<<<<<<<

If you wish you can continue and add in entries for routers 4 to 6, or to 9.
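As the Targets file grows, a mistyped host name becomes more likely.
The sketch below lists every distinct "host =" value in a Targets-style
file and reports the ones that do not resolve. Both function names are
hypothetical helpers, not Smokeping commands, and check_target_hosts
assumes the getent utility is available, as it normally is on the
Debian/Ubuntu systems used in this class.

```shell
# Hypothetical helper (not part of Smokeping): print each distinct value of
# a "host = ..." line in a Targets-style file. Values starting with "/" are
# skipped, since those refer to other Targets entries, not to host names.
list_target_hosts() {
    awk -F'=' '/^[ \t]*host[ \t]*=/ {
        value = $2
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
        if (value !~ /^\// && !seen[value]++) print value
    }' "$1"
}

# Hypothetical helper: report the listed hosts that do not resolve.
# Assumes "getent" exists on the system.
check_target_hosts() {
    for h in $(list_target_hosts "$1"); do
        getent hosts "$h" >/dev/null || echo "cannot resolve: $h"
    done
}
```

For example, "check_target_hosts /etc/smokeping/config.d/Targets" prints
nothing when every host name resolves. A name that is reported may need to
be written as a fully qualified name such as pc1.ws.nsrc.org.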
When you are ready, save and exit from the Targets file and verify your
work:

    # service smokeping stop
    # service smokeping start

If you want you might consider adding the Wireless Access Points:

    # editor Targets

>>>>>>>>
++ap1

menu = ap1
title = Wireless Access Point 1
host = ap1

++ap2

menu = ap2
title = Wireless Access Point 2
host = ap2
<<<<<<<<

5. Add new probes to Smokeping
------------------------------

The current entry in the Probes file is fine, but if you wish to use
additional Smokeping checks you can add them in here and you can specify
their default behavior. You can do this, as well, in the Targets file if
you wish.

To add a probe to check for HTTP latency as well as DNS lookup latency add
the following to the end of the Probes file:

    # editor Probes

>>>>>>>>
+ EchoPingHttp

+ DNS
binary = /usr/bin/dig
pings = 5
step = 180
lookup = www.nsrc.org
<<<<<<<<

The DNS probe will look up the IP address of www.nsrc.org using any other
open DNS server you specify in the Targets file. You will see this a bit
further on.

Now save and exit from the file and verify that your changes are working:

    # service smokeping stop
    # service smokeping start

6. Add HTTP latency checks for the classroom PCs
------------------------------------------------

Edit the Targets file again and go to the end of the file:

    # editor Targets

At the end of the file add:

>>>>>>>>
#
# Web server response
#

+HTTP

menu = HTTP Response
title = HTTP Response Student PCs

++pc1

menu = pc1
title = pc1 HTTP response time
probe = EchoPingHttp
host = pc1

++pc2

menu = pc2
title = pc2 HTTP response time
probe = EchoPingHttp
host = pc2

++pc3

menu = pc3
title = pc3 HTTP response time
probe = EchoPingHttp
host = pc3

++pc4

menu = pc4
title = pc4 HTTP response time
probe = EchoPingHttp
host = pc4
<<<<<<<<

You could actually just use the "probe = EchoPingHttp" statement once for
pc1, and then this would be the default probe until another "probe = "
statement is seen in the Targets file.
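A more compact layout is sketched below. It assumes that a "probe =" line
placed at the +HTTP level applies to the entries beneath it, which is the
same arrangement the +DNS section in exercise 7 uses. Treat it as an
illustration and check the result on your own installation before relying
on it. Only the first two PCs are shown.

```
+HTTP

probe = EchoPingHttp
menu = HTTP Response
title = HTTP Response Student PCs

++pc1

menu = pc1
title = pc1 HTTP response time
host = pc1

++pc2

menu = pc2
title = pc2 HTTP response time
host = pc2
```

If the graphs under "HTTP Response" still show ping-style results after a
stop and start, put the "probe = EchoPingHttp" line back into each entry.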
You can add more PC entries if you wish, or you could consider checking the
latency on remote machines - these are likely to be more interesting.
Machines such as your own publicly accessible servers are a good choice,
or perhaps other web servers you use often (Google, Yahoo, government
pages, stores, etc.).

Once you are done, save and exit from the Targets file and verify your work:

    # service smokeping stop
    # service smokeping start

7. Add DNS latency checks
-------------------------

At the end of the Targets file we are going to add some entries to verify
the latency from our location to remote recursive DNS servers to look up an
entry for nsrc.org. You would likely substitute an important address for
your institution in the Probes file instead. In addition, you can change
the address you are looking up inside the Targets file as well. For more
information see:

    http://oss.oetiker.ch/smokeping/probe/DNS.en.html

and

    http://oss.oetiker.ch/smokeping/probe/index.en.html

Now edit the Targets file again. Be sure to go to the end of the file:

    # cd /etc/smokeping/config.d     (just to be sure...)
    # editor Targets

At the end of the file add:

>>>>>>>>
#
# Sample DNS probe
#

+DNS

probe = DNS
menu = DNS Latency
title = DNS Latency Probes

++LocalDNS1

menu = 10.10.0.250
title = DNS Delay for local DNS Server on noc.ws.nsrc.org
host = noc.ws.nsrc.org

++GoogleA

menu = 8.8.8.8
title = DNS Latency for google-public-dns-a.google.com
host = google-public-dns-a.google.com

++GoogleB

menu = 8.8.4.4
title = DNS Latency for google-public-dns-b.google.com
host = google-public-dns-b.google.com

++OpenDNSA

menu = 208.67.222.222
title = DNS Latency for resolver1.opendns.com
host = resolver1.opendns.com

++OpenDNSB

menu = 208.67.220.220
title = DNS Latency for resolver2.opendns.com
host = resolver2.opendns.com
<<<<<<<<

Now save the Targets file, exit, and verify your work:

    # service smokeping stop
    # service smokeping start

Look at additional Smokeping probes and consider implementing some of them
if they are useful to your organization:

    http://oss.oetiker.ch/smokeping/probe/index.en.html

8. MultiHost graphing
---------------------

Once you have defined a group of hosts under a single probe type in your
/etc/smokeping/config.d/Targets file, then you can create a single graph
that will show you the results of all Smokeping tests for all hosts that
you define. This has the advantage of letting you quickly compare, for
example, a group of hosts that you are monitoring with the FPing probe.

The MultiHost graph function in Smokeping is extremely picky - pay close
attention!

To create a MultiHost graph first edit the file Targets:

    # editor Targets

Find the end of your initial PC definitions. It should be just before you
started to configure your routers and switches. That section starts with:

>>>>>>>>
#
# ********** Classroom Backbone Switch *********
#
<<<<<<<<

So, just _before_ this we'll create two MultiHost entries. One will be for
PCs number 1-12, or all the PCs in groups 1 to 3, and the other will be
for PCs number 13-24, or all the PCs in groups 4 to 6.

Warning!
If you have not already configured PCs 1 to 24, then do not configure any
entries with PCs that are not yet defined.

Now add the two MultiHost entries. They look like this:

>>>>>>>>
++MultihostHTTPGroups1-3

menu = MultihostHTTPGroups1-3
title = Combined HTTP Results
host = /HTTP/pc1 /HTTP/pc2 /HTTP/pc3 /HTTP/pc4 /HTTP/pc5 /HTTP/pc6 \
       /HTTP/pc7 /HTTP/pc8 /HTTP/pc9 /HTTP/pc10 /HTTP/pc11 /HTTP/pc12

++MultihostHTTPGroups4-6

menu = MultihostHTTPGroups4-6
title = Combined HTTP Results
host = /HTTP/pc13 /HTTP/pc14 /HTTP/pc15 /HTTP/pc16 /HTTP/pc17 /HTTP/pc18 \
       /HTTP/pc19 /HTTP/pc20 /HTTP/pc21 /HTTP/pc22 /HTTP/pc23 /HTTP/pc24
<<<<<<<<

Save and exit from the Targets file. Now attempt to restart Smokeping:

    # service smokeping stop
    # service smokeping start

If this fails you almost certainly have an error in the entries. If you
cannot figure out what the error is (remember to try
"tail /var/log/messages" first!) ask your instructor for some help.

If things work and you want to add a MultiHost entry for your DNS servers,
then edit the file Targets, but go to the very end of the file and add:

>>>>>>>>
#
# Multihost Graph of all DNS latency checks
#

++MultiHostDNS

menu = MultiHost DNS
title = Consolidated DNS Responses
host = /DNS/LocalDNS1 /DNS/GoogleA /DNS/GoogleB /DNS/OpenDNSA /DNS/OpenDNSB
<<<<<<<<

As with the PCs, list only the DNS entries that you have actually defined
in the +DNS section.

And, as always, save and exit from the file Targets and test your new
configuration.

9. Send Smokeping alerts
------------------------

If you wish to receive an email when an alert condition is met on one of
the Smokeping checks first do this:

    # cd /etc/smokeping/config.d
    # editor Alerts

Update the top of the file where it says:

    *** Alerts ***
    to = alertee@address.somewhere
    from = smokealert@company.xy

to include a proper "to" and "from" field for your server.
Something like:

    *** Alerts ***
    to = sysadm@localhost
    from = smokeping-alert@localhost

Now you must update your device entries to include a line that reads:

    alerts = alertName1, alertName2, etc, etc...

For instance, the alerts named "startloss", "bigloss", and "rttdetect" have
already been defined in the file Alerts.

To read about Smokeping alerts and what they are detecting, how to create
your own, etc. see:

    http://oss.oetiker.ch/smokeping/doc/smokeping_config.en.html

At the bottom of that page is a section titled "*** Alerts ***".

To place some alert detection on some of your hosts open the file Targets:

    # editor Targets

and go near the start of the file where we defined our PCs. Just under the
"host =" line add another line that looks like this:

    alerts = startloss,bigloss,rttdetect

So, for example, the pc1 entry would now look like this:

>>>>>>>>
++pc1

menu = pc1
title = Virtual Machine 1
host = pc1
alerts = startloss,bigloss,rttdetect
<<<<<<<<

If you want to add an alerts option to other hosts go ahead. Once you are
done save and exit from the Targets file and then verify that your
configuration works:

    # service smokeping stop
    # service smokeping start

If any of the hosts that have the "alerts = " option set meet the
conditions to set off the alert, then an email will arrive in the sysadm
user's mailbox on the Smokeping server machine (localhost). It's not
likely that an alert will be set off for most machines. To check, you can
read the email for the sysadm user by using an email client like "mutt":

    # apt-get install mutt
    # mutt

Say yes to mailbox creation when prompted, then see if you have email from
the smokeping-alert@localhost user.

10. Slave instances - Informational Only
----------------------------------------

This is a description only for informational purposes in case you wish to
attempt this type of configuration once the workshop is over.
The idea behind this is that you can run multiple Smokeping instances at
multiple locations that are monitoring the same hosts and/or services as
your master instance. The slaves will send their results to the master
server and you will see these results side-by-side with your local
results. This allows you to view how users outside your network see your
services and hosts.

This can be a powerful tool for resolving service and host issues that may
be difficult to troubleshoot if you only have local data.

Graphically this looks like this:

    [slave 1]     [slave 2]      [slave 3]
        |             |              |
        +-------+     |     +--------+
                |     |     |
                v     v     v
            +---------------+
            |    master     |
            +---------------+

You can see an example of this data here:

    http://oss.oetiker.ch/smokeping-demo/

Look at the various graph groups and notice that many of the graphs have
multiple lines with the color code chart listing items such as "median RTT
from mipsrv01" - These are not MultiHost graphs, but rather graphs with
data from external Smokeping servers.

To configure a Smokeping master/slave server you can see the documentation
here:

    http://oss.oetiker.ch/smokeping/doc/smokeping_master_slave.en.html

In addition, a sample set of steps for configuring this is available in
the file sample-smokeping-master-slave.txt which should be listed as an
additional reference at the bottom of the Agenda page on your classroom
wiki.