Network devices such as routers and switches generally don’t have prometheus exporters. Fortunately, there is snmp_exporter, which converts prometheus scrapes into SNMP queries.
Log in to your campus server instance (srv1-campusX.ws.nsrc.org).
Enter the prometheus container to get a root shell:
incus shell prometheus
Check that snmp_exporter is running:
systemctl status snmp_exporter
Use the cursor keys to move around the status and log output, and “q” to quit. If there are any errors, ask for assistance.
Type the following to see which flags snmp_exporter is running with:
cat /etc/default/snmp_exporter
You should see:
OPTIONS='--config.file=/etc/prometheus/snmp.d/*.yml --web.listen-address=:9116'
It reads every file in that directory whose name ends with
.yml. Have a look at what’s there:
cd /etc/prometheus/snmp.d
ls
There is a file snmp.yml which contains all the MIB
definitions. We won’t touch this (there is a separate “generator” tool
which can be used to compile MIBs into the correct format).
However, we do need to set up our SNMP credentials to communicate with our devices.
Create a file called auth.yml in this directory:
editor /etc/prometheus/snmp.d/auth.yml
and paste in the following exactly as shown:
auths:
  nsrc_v2:
    version: 2
    community: NetManage
  nsrc_v3:
    version: 3
    security_level: authNoPriv
    username: admin
    auth_protocol: SHA
    password: NetManage
These define the credentials we want to use for SNMPv2 and SNMPv3 respectively.
Save, then signal to snmp_exporter to pick up the change using “reload”:
systemctl reload snmp_exporter
journalctl -eu snmp_exporter # check for errors
If there are any errors, fix them.
Perform manual scrapes of two devices, using the following commands:
curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=gw.ws.nsrc.org'
curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=core1-campusX.ws.nsrc.org'
(The quotation marks are important: they stop the shell interpreting the ampersand as a special character.)
Note that in each case the scrape is being sent to
localhost (where snmp_exporter is running), but it includes
three parameters: module says which MIB to retrieve,
auth which credentials to use, and target
tells snmp_exporter where to send the SNMP query (this can be either a
resolvable DNS name or an IP address).
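As an illustration of how the three parameters combine into one scrape URL, here is a minimal Python sketch. The helper name `snmp_scrape_url` and its defaults are made up for this example; only the parameter names and values come from the curl commands above.

```python
from urllib.parse import urlencode


def snmp_scrape_url(target, module="if_mib", auth="nsrc_v3",
                    exporter="localhost:9116"):
    """Build the URL that asks snmp_exporter to query one device.

    Hypothetical helper: the defaults mirror the curl commands above.
    """
    # urlencode joins the parameters with "&" and escapes special characters
    query = urlencode({"module": module, "auth": auth, "target": target})
    return f"http://{exporter}/snmp?{query}"


url = snmp_scrape_url("gw.ws.nsrc.org")
print(url)
```

Because the URL is built inside Python rather than typed at a shell prompt, no quoting of the ampersands is needed here.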
You should get a large number of metrics back in prometheus format, e.g.
...
# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744
...
The comment shows the SNMP OID, but in each case it has been translated to a plain prometheus metric.
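To make the structure of a metric line concrete, the following rough sketch splits the sample line above into its metric name, labels and value. It is a simplified parser written for this example only (it assumes label values contain no escaped quotes) and is not how prometheus itself parses scrapes.

```python
import re

# The sample line shown above
SAMPLE = ('ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet '
          'Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744')

# name, optional {labels}, whitespace, value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# key="value" pairs; simplification: no escaped quotes inside values
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Return (metric_name, labels_dict, value) for one metric line."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"not a metric line: {line!r}")
    labels = dict(LABEL_RE.findall(m.group("labels") or ""))
    return m.group("name"), labels, float(m.group("value"))


name, labels, value = parse_sample(SAMPLE)
print(name, labels["ifName"], value)
```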
Now we are ready to move onto configuring prometheus.
First, create a targets file
/etc/prometheus/targets.d/snmp.yml containing the
following:
- labels:
    module: if_mib
    auth: nsrc_v3
  targets:
    - gw.ws.nsrc.org
    - bdr1-campusX.ws.nsrc.org
    - core1-campusX.ws.nsrc.org
However, we have a slight problem: we don’t want prometheus to scrape these targets directly. We want it to scrape the snmp_exporter on localhost and pass the target, module and auth as parameters in the URL. To do this, we need to use prometheus’ relabeling feature.
Edit /etc/prometheus/prometheus.yml and add the
following to the bottom of the scrape_configs: section:
  - job_name: 'snmp'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets.d/snmp.yml
    metrics_path: /snmp
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [module]
        target_label: __param_module
      - source_labels: [auth]
        target_label: __param_auth
      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter
Again, be careful with spacing. There should be two spaces before the
dash before job_name:, so that it aligns exactly with the
dashes of earlier scrape jobs.
For details of how this works, see the end of this sheet.
Now get prometheus to pick up the changes:
systemctl reload prometheus
journalctl -eu prometheus # CHECK FOR ERRORS!
Return to the prometheus web interface at http://oob-srv1-campusX.ws.nsrc.org/prometheus
Select the “Table” tab and run the following queries:
up{job="snmp"}
scrape_samples_scraped{job="snmp"}
The query “up” will return 1 for all target devices that were successfully scraped. “scrape_samples_scraped” will show the number of values retrieved; if it’s 0 then that means there was a problem with SNMP.
If there is a problem, check under Status > Targets, which may show you more information. Sometimes it is helpful to use tcpdump to see the scrape attempts between prometheus and snmp_exporter:
tcpdump -i lo -nnA -s0 tcp port 9116
If scraping is successful, then you can now browse some of the values using the Table tab, for example:
ifOperStatus # this is a gauge (values 1,2 etc defined in the MIB)
ifHCInOctets # this is a counter
Can you remember how to change this counter into a rate in bits-per-second, so that you can get a traffic graph? Refer to the node_exporter exercise if you need to.
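As a reminder of the arithmetic involved, here is a small sketch that turns two readings of an octet counter into bits per second. The second reading and the 15-second gap are made-up numbers for illustration; in prometheus the per-second part is done by the rate() function, which also copes with counter resets, something this sketch does not attempt.

```python
def bits_per_second(octets_then, octets_now, seconds):
    """Average traffic rate between two readings of an octet counter."""
    if seconds <= 0:
        raise ValueError("readings must be taken at different times")
    # octets (bytes) transferred, times 8 bits per octet, per second
    return (octets_now - octets_then) * 8 / seconds


# Hypothetical readings of ifHCInOctets taken 15 seconds apart
bps = bits_per_second(448_744, 636_244, 15)
print(bps)
```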
Add the border and core routers for ONE other campus to your targets file. Don’t add them all, in case the 15-second polling interval overwhelms our platform.
Remember that you don’t need to reload prometheus after updating the targets file: prometheus watches the file and re-reads it automatically when it changes.
When prometheus reads a targets file, it puts each target’s address into a hidden
label called __address__. It also uses
__address__ as the endpoint to scrape. Just before
scraping, the __address__ is copied to a label called
“instance” if one doesn’t exist. Finally, any label beginning with
__ is removed from the result.
However, before scraping there is an optional relabeling phase, where a set of relabeling steps is applied in order. We have configured the following steps:
- source_labels: [__address__]
  target_label: instance
This copies the __address__ label to the
instance label. Therefore we end up with a label like
instance="gw.ws.nsrc.org"
- source_labels: [__address__]
  target_label: __param_target
We also copy the __address__ label to
__param_target; this gets applied as a parameter called
“target” in the final scrape URL.
- source_labels: [module]
  target_label: __param_module
Similarly, we copy the label module (which was applied
in the targets file to the group of targets) to
__param_module, which becomes a parameter called “module”
in the scrape URL.
- source_labels: [auth]
  target_label: __param_auth
And we copy the label auth to __param_auth,
which becomes a parameter “auth” in the scrape URL.
- target_label: __address__
  replacement: 127.0.0.1:9116
Finally, we replace __address__ with “127.0.0.1:9116”,
which means that the actual scrape is sent to the snmp_exporter running
on the local host. We also set metrics_path to /snmp,
instead of the default which is /metrics, because this is
what snmp_exporter requires.
The final scrape, therefore, goes to:
http://127.0.0.1:9116/snmp?target=<target>&module=<module>&auth=<auth>
             ^         ^             ^               ^            ^
             |         |             |               |            |
      new __address__  |      __param_target  __param_module      |
                       |      (from original                __param_auth
       metrics_path ---'       __address__)
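The relabeling steps described above can also be traced in a few lines of Python. This is only a simulation for one target, using plain dictionaries; the function names are invented for the example, and it leaves out things prometheus adds itself, such as the job label. The order of the parameters in the real scrape URL may also differ.

```python
from urllib.parse import urlencode


def apply_relabeling(labels):
    """Mimic the five relabel steps of the 'snmp' job for one target."""
    out = dict(labels)
    out["instance"] = out["__address__"]        # step 1: keep device name as instance
    out["__param_target"] = out["__address__"]  # step 2: becomes ?target=
    out["__param_module"] = out["module"]       # step 3: becomes &module=
    out["__param_auth"] = out["auth"]           # step 4: becomes &auth=
    out["__address__"] = "127.0.0.1:9116"       # step 5: scrape snmp_exporter instead
    return out


def scrape_url(labels, metrics_path="/snmp"):
    """Build the scrape URL from __address__ and the __param_* labels."""
    prefix = "__param_"
    params = {k[len(prefix):]: v for k, v in labels.items() if k.startswith(prefix)}
    return f"http://{labels['__address__']}{metrics_path}?{urlencode(params)}"


def final_labels(labels):
    """Labels beginning with __ are removed from the result."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}


# One entry from the targets file, as seen before relabeling
discovered = {"__address__": "gw.ws.nsrc.org", "module": "if_mib", "auth": "nsrc_v3"}
relabeled = apply_relabeling(discovered)
print(scrape_url(relabeled))
print(final_labels(relabeled))
```

Note that the module and auth labels survive into the stored metrics, because they do not begin with a double underscore.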
There is a more detailed explanation in the prometheus documentation.