Network devices such as routers and switches don’t run prometheus exporters. Fortunately there is snmp_exporter, which can translate prometheus scrapes into SNMP queries.
Login to your campus server instance (srv1-campusX.ws.nsrc.org).
Enter the prometheus container to get a root shell:
incus shell prometheus
Check that snmp_exporter is running:
systemctl status snmp_exporter
Use the cursor keys to move around the journalctl log output, and “q” to quit. If there are any errors, ask for assistance.
Type the following to see which flags snmp_exporter is running with:
cat /etc/default/snmp_exporter
You should see:
OPTIONS='--config.file=/etc/prometheus/snmp.d/*.yml --web.listen-address=:9116'
It reads every file within that directory ending in *.yml. Have a look at what’s there:
cd /etc/prometheus/snmp.d
ls
There is a file snmp.yml which contains all the MIB definitions. We won’t touch this (there is a separate “generator” tool which can be used to compile MIBs into the correct format).
However, we do need to set up our SNMP credentials to communicate with our devices.
Create a file called auth.yml in this directory:
editor /etc/prometheus/snmp.d/auth.yml
and paste in the following exactly as shown:
auths:
  nsrc_v2:
    version: 2
    community: NetManage
  nsrc_v3:
    version: 3
    security_level: authNoPriv
    username: admin
    auth_protocol: SHA
    password: NetManage
These define the credentials we want to use for SNMPv2 and SNMPv3 respectively.
Save, then signal to snmp_exporter to pick up the change using “reload”:
systemctl reload snmp_exporter
journalctl -eu snmp_exporter # check for errors
If there are any errors, fix them.
Perform manual scrapes of two devices, using the following commands:
curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=gw.ws.nsrc.org'
curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=core1-campusX.ws.nsrc.org'
(the quotation marks are important: they stop the shell interpreting the ampersand as a special character)
Note that in each case the scrape is sent to localhost (where snmp_exporter is running), but it includes three parameters: module says which MIB to retrieve, auth says which credentials to use, and target tells snmp_exporter where to send the SNMP query (either a resolvable DNS name or an IP address).
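To see how the three parameters fit together, here is a sketch that assembles the same scrape URL from shell variables (using echo in place of curl, so nothing is actually scraped; the values are the ones used above):

```shell
# Hypothetical variables holding the three scrape parameters:
module=if_mib
auth=nsrc_v3
target=gw.ws.nsrc.org

# Assemble the URL exactly as passed to curl above:
echo "localhost:9116/snmp?module=${module}&auth=${auth}&target=${target}"
```

Swapping target for another device name is all that is needed to scrape a different device through the same exporter.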
You should get a large number of metrics back in prometheus format, e.g.
...
# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744
...
The comment shows the SNMP OID, but in each case it has been translated to a plain prometheus metric.
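Since the output is plain text, one value per line, you can pick values out with standard tools. A small sketch using awk on the sample line from the text (in practice you would pipe the curl output instead):

```shell
# The value is always the last whitespace-separated field on the line:
echo 'ifHCInOctets{ifIndex="2",ifName="eno1"} 448744' | awk '{print $NF}'
```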
Now we are ready to move onto configuring prometheus.
Firstly, configure a targets file /etc/prometheus/targets.d/snmp.yml containing the following:
- labels:
    module: if_mib
    auth: nsrc_v3
  targets:
    - gw.ws.nsrc.org
    - bdr1-campusX.ws.nsrc.org
    - core1-campusX.ws.nsrc.org
However we have a slight problem: we don’t want prometheus to scrape these targets directly. We want it to scrape the snmp_exporter on localhost and pass the target and module as parameters in the URL. To do this, we are going to need to use prometheus’ relabeling feature.
Edit /etc/prometheus/prometheus.yml and add the following to the bottom of the scrape_configs: section:
  - job_name: 'snmp'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets.d/snmp.yml
    metrics_path: /snmp
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [module]
        target_label: __param_module
      - source_labels: [auth]
        target_label: __param_auth
      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter
Again, be careful with spacing. There should be two spaces before the dash in front of job_name:, so that it aligns exactly with the dashes of the earlier scrape jobs.
For details of how this works, see the end of this sheet.
Now get prometheus to pick up the changes:
systemctl reload prometheus
journalctl -eu prometheus # CHECK FOR ERRORS!
Return to the prometheus web interface at http://oob-srv1-campusX.ws.nsrc.org/prometheus
Select the “Table” tab and run the following queries:
up{job="snmp"}
scrape_samples_scraped{job="snmp"}
The query “up” will return 1 for all target devices that were successfully scraped. “scrape_samples_scraped” will show the number of values retrieved; if it’s 0 then that means there was a problem with SNMP.
If there is a problem, you can check under Status > Targets which may show you more information. Sometimes it is helpful to use tcpdump to see the scrape attempts between prometheus and snmp_exporter:
tcpdump -i lo -nnA -s0 tcp port 9116
If scraping is successful, then you can now browse some of the values using the Table tab, for example:
ifOperStatus # this is a gauge (values 1,2 etc defined in the MIB)
ifHCInOctets # this is a counter
Can you remember how to change this counter into a rate in bits-per-second, so that you can get a traffic graph? Refer to the node_exporter exercise if you need to.
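If the arithmetic behind the conversion is hazy, here is a sketch with two hypothetical counter samples taken 60 seconds apart (the first value is the sample shown earlier; the second is invented for illustration). This is what a PromQL expression like rate(ifHCInOctets[5m]) * 8 computes per step:

```shell
# Two octet-counter samples, dt seconds apart (hypothetical values):
v1=448744
v2=1198744
dt=60

# (octets per second) * 8 = bits per second
echo $(( (v2 - v1) * 8 / dt ))
```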
Add the border and core routers for ONE other campus in your targets file. Don’t do them all in case the 15-second polling interval overwhelms our platform.
Remember that you don’t need to reload prometheus after updating the targets file.
When prometheus reads a target file, it puts each entry into a hidden label called __address__. It also uses __address__ as the endpoint to scrape. Just before scraping, __address__ is copied to a label called “instance” if one doesn’t already exist. Finally, any label beginning with __ is removed from the result.
However, before scraping there is an optional relabeling phase, where a set of relabeling steps are applied in order. What we have done is the following steps:
- source_labels: [__address__]
  target_label: instance

This copies the __address__ label to the instance label. Therefore we end up with a label like instance="gw.ws.nsrc.org"
- source_labels: [__address__]
  target_label: __param_target

We also copy the __address__ label to __param_target; this is applied as a parameter called “target” in the final scrape URL.
- source_labels: [module]
  target_label: __param_module

Similarly, we copy the label module (which was applied in the targets file to the group of targets) to __param_module, which becomes a parameter called “module” in the scrape URL.
- source_labels: [auth]
  target_label: __param_auth

And we copy the label auth to __param_auth, which becomes a parameter “auth” in the scrape URL.
- target_label: __address__
  replacement: 127.0.0.1:9116

Finally, we replace __address__ with “127.0.0.1:9116”, which means that the actual scrape is sent to the snmp_exporter running on the local host. We also set metrics_path to /snmp, instead of the default /metrics, because this is what snmp_exporter requires.
The final scrape, therefore, goes to:

http://127.0.0.1:9116/snmp?target=<target>&module=<module>&auth=<auth>
       ^              ^    ^               ^               ^
       |              |    |               |               |
       |              |    |               |               __param_auth
       |              |    |               __param_module
       |              |    __param_target (from the original __address__)
       |              metrics_path
       new __address__
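The five relabeling steps above can be sketched as shell variable copies (a simplification: prometheus applies these to its internal label set, but the shell variable names here mirror the label names exactly):

```shell
# Labels as loaded from the targets file:
__address__="gw.ws.nsrc.org"
module="if_mib"
auth="nsrc_v3"

instance="$__address__"          # step 1: [__address__] -> instance
__param_target="$__address__"    # step 2: [__address__] -> __param_target
__param_module="$module"         # step 3: [module]      -> __param_module
__param_auth="$auth"             # step 4: [auth]        -> __param_auth
__address__="127.0.0.1:9116"     # step 5: fixed replacement for __address__

# The resulting scrape URL:
echo "http://${__address__}/snmp?target=${__param_target}&module=${__param_module}&auth=${__param_auth}"
```

Note that instance keeps the original device name, which is why the stored metrics are labelled per device even though every scrape actually goes to 127.0.0.1:9116.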
There is a more detailed explanation in the prometheus documentation.