snmp_exporter - collect SNMP MIBs

Network devices such as routers and switches generally cannot run prometheus exporters. Fortunately there is snmp_exporter, which can convert prometheus scrapes into SNMP queries.

Log in to your campus server instance (srv1-campusX.ws.nsrc.org).

Start snmp_exporter

Enter the prometheus container to get a root shell:

incus shell prometheus

Check that snmp_exporter is running:

systemctl status snmp_exporter

Use the cursor keys to move around the log output, and press “q” to quit. If there are any errors, ask for assistance.

Configure community string

Type the following to see which flags snmp_exporter is running with:

cat /etc/default/snmp_exporter

You should see:

OPTIONS='--config.file=/etc/prometheus/snmp.d/*.yml --web.listen-address=:9116'

It reads every file in that directory whose name ends in .yml. Have a look at what’s there:

cd /etc/prometheus/snmp.d
ls

There is a file snmp.yml which contains all the MIB definitions. We won’t touch this (there is a separate “generator” tool which can be used to compile MIBs into the correct format).
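For reference, the generator tool takes a much smaller YAML description of which MIB objects to walk and expands it into the full snmp.yml. A fragment might look like this (illustrative only; the module name and object list here are assumptions — see the snmp_exporter documentation for the exact format):

```yaml
modules:
  if_mib:
    walk:
      - ifTable     # standard interface counters
      - ifXTable    # 64-bit (HC) counters, ifName, ifAlias
```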

However, we do need to set up our SNMP credentials to communicate with our devices.

Create a file called auth.yml in this directory:

editor /etc/prometheus/snmp.d/auth.yml

and paste in the following exactly as shown:

auths:
  nsrc_v2:
    version: 2
    community: NetManage

  nsrc_v3:
    version: 3
    security_level: authNoPriv
    username: admin
    auth_protocol: SHA
    password: NetManage

These are defining the credentials we want to use for SNMPv2 and SNMPv3 respectively.

Save, then signal snmp_exporter to pick up the change using “reload”:

systemctl reload snmp_exporter
journalctl -eu snmp_exporter     # check for errors

If there are any errors, fix them.

Manual scrape

Perform manual scrapes of two devices, using the following commands:

curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=gw.ws.nsrc.org'
curl 'localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=core1-campusX.ws.nsrc.org'

(the quotation marks are important, to stop the shell interpreting the ampersand as a special character)

Note that in each case the scrape is sent to localhost (where snmp_exporter is running), but it includes three parameters: module says which MIB definitions to use, auth says which credentials to use, and target tells snmp_exporter where to send the SNMP query (either a resolvable DNS name or an IP address).
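The three parameters are ordinary URL query parameters, so the scrape URL can be assembled mechanically. A small Python sketch (illustrative only — curl or prometheus does this for you):

```python
from urllib.parse import urlencode

def scrape_url(target, module="if_mib", auth="nsrc_v3",
               exporter="localhost:9116"):
    """Build the URL that curl (or prometheus) sends to snmp_exporter."""
    params = urlencode({"module": module, "auth": auth, "target": target})
    return f"http://{exporter}/snmp?{params}"

print(scrape_url("gw.ws.nsrc.org"))
# http://localhost:9116/snmp?module=if_mib&auth=nsrc_v3&target=gw.ws.nsrc.org
```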

You should get a large number of metrics back in prometheus format, e.g.

...
# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744
...

The HELP comment shows the source SNMP OID; in each case the OID has been translated into a plain prometheus metric.
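Each metric line follows the prometheus exposition format: a metric name, a set of labels in braces, and a value. A minimal Python sketch of how such a line breaks apart (illustrative only — the real format also allows escapes and timestamps, which this ignores):

```python
import re

# name{label="value",...} value
METRIC_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_metric(line):
    """Split one exposition-format line into (name, labels, value)."""
    m = METRIC_RE.match(line)
    name, labelstr, value = m.groups()
    labels = dict(LABEL_RE.findall(labelstr))
    return name, labels, float(value)

line = ('ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet '
        'Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744')
name, labels, value = parse_metric(line)
print(name, labels["ifName"], value)   # ifHCInOctets eno1 448744.0
```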

Prometheus configuration

Now we are ready to move on to configuring prometheus.

Firstly, configure a targets file /etc/prometheus/targets.d/snmp.yml containing the following:

- labels:
    module: if_mib
    auth: nsrc_v3
  targets:
    - gw.ws.nsrc.org
    - bdr1-campusX.ws.nsrc.org
    - core1-campusX.ws.nsrc.org

However we have a slight problem: we don’t want prometheus to scrape these targets directly. We want it to scrape the snmp_exporter on localhost and pass the target and module as parameters in the URL. To do this, we are going to need to use prometheus’ relabeling feature.

Edit /etc/prometheus/prometheus.yml and add the following to the bottom of the scrape_configs: section:

  - job_name: 'snmp'
    file_sd_configs:
      - files:
         - /etc/prometheus/targets.d/snmp.yml
    metrics_path: /snmp
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [module]
        target_label: __param_module
      - source_labels: [auth]
        target_label: __param_auth
      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter

Again, be careful with spacing. There should be two spaces before the dash before job_name:, so that it aligns exactly with the dashes of earlier scrape jobs.

For details how this works, see the end of this sheet.

Now get prometheus to pick up the changes:

systemctl reload prometheus
journalctl -eu prometheus    # CHECK FOR ERRORS!

Testing

Return to the prometheus web interface at http://oob-srv1-campusX.ws.nsrc.org/prometheus

Select the “Table” tab and run the following queries:

up{job="snmp"}

scrape_samples_scraped{job="snmp"}

The query “up” will return 1 for all target devices that were successfully scraped. “scrape_samples_scraped” will show the number of values retrieved; if it’s 0 then that means there was a problem with SNMP.

If there is a problem, you can check under Status > Targets which may show you more information. Sometimes it is helpful to use tcpdump to see the scrape attempts between prometheus and snmp_exporter:

tcpdump -i lo -nnA -s0 tcp port 9116

If scraping is successful, then you can now browse some of the values using the Table tab, for example:

ifOperStatus    # this is a gauge (values 1,2 etc defined in the MIB)

ifHCInOctets    # this is a counter

Can you remember how to change this counter into a rate in bits-per-second, so that you can get a traffic graph? Refer to the node_exporter exercise if you need to.
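As a reminder of the arithmetic behind that conversion: a counter only ever increases, so the rate is the difference between two samples divided by the time between them, then multiplied by 8 to go from octets to bits. A Python sketch with made-up sample values (illustrative only — PromQL’s rate() also handles counter resets, which this ignores):

```python
def bits_per_second(octets_t1, octets_t2, seconds):
    """Traffic rate from two samples of a byte counter."""
    return (octets_t2 - octets_t1) / seconds * 8

# Two ifHCInOctets samples taken 15 seconds apart (made-up values):
print(bits_per_second(448744, 486244, 15))   # 20000.0 bits/sec
```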

Adding more nodes

Add the border and core routers for ONE other campus to your targets file. Don’t add them all, in case the 15-second polling interval overwhelms our platform.

Remember that you don’t need to reload prometheus after updating the targets file: prometheus watches files listed under file_sd_configs and picks up changes automatically.
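For example, if the neighbouring campus were campus2 (a hypothetical name — substitute the campus you were actually assigned), the file would grow to:

```yaml
- labels:
    module: if_mib
    auth: nsrc_v3
  targets:
    - gw.ws.nsrc.org
    - bdr1-campusX.ws.nsrc.org
    - core1-campusX.ws.nsrc.org
    - bdr1-campus2.ws.nsrc.org    # hypothetical neighbouring campus
    - core1-campus2.ws.nsrc.org
```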

How relabeling works (reference only)

When prometheus reads a target file, it puts each entry into a hidden label called __address__. It also uses __address__ as the endpoint to scrape. Just before scraping, the __address__ is copied to a label called “instance” if one doesn’t exist. Finally, any label beginning with __ is removed from the result.

However, before scraping there is an optional relabeling phase, in which a sequence of relabeling steps is applied in order. The steps we configured do the following:

      - source_labels: [__address__]
        target_label: instance

This copies the __address__ label to the instance label, so we end up with a label like instance="gw.ws.nsrc.org".

      - source_labels: [__address__]
        target_label: __param_target

We also copy the __address__ label to __param_target; this gets applied as a parameter called “target” in the final scrape URL.

      - source_labels: [module]
        target_label: __param_module

Similarly, we copy the label module (which was applied in the targets file to the group of targets) to __param_module, which becomes a parameter called “module” in the scrape URL.

      - source_labels: [auth]
        target_label: __param_auth

And we copy the label auth to __param_auth, which becomes a parameter “auth” in the scrape URL.

      - target_label: __address__
        replacement: 127.0.0.1:9116

Finally, we replace __address__ with “127.0.0.1:9116”, which means that the actual scrape is sent to the snmp_exporter running on the local host. We also set metrics_path to /snmp, instead of the default which is /metrics, because this is what snmp_exporter requires.

The final scrape, therefore, goes to:

http://127.0.0.1:9116/snmp?target=<target>&module=<module>&auth=<auth>
              ^         ^            ^               ^            ^
              |         |            |               |            |
      new __address__   |      __param_target  __param_module     |
                        |      (from original               __param_auth
        metrics_path ---'        __address__)
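Putting it all together, the pipeline can be replayed as a small Python sketch (illustrative only — real prometheus relabeling also supports regexes, actions, and keep/drop rules, none of which we used here):

```python
# One target as read from snmp.yml, before relabeling:
labels = {
    "__address__": "gw.ws.nsrc.org",   # the entry from the targets list
    "module": "if_mib",                # group labels from the targets file
    "auth": "nsrc_v3",
}

# The five relabel_configs steps, applied in order:
labels["instance"] = labels["__address__"]        # keep a readable instance label
labels["__param_target"] = labels["__address__"]  # becomes ?target=...
labels["__param_module"] = labels["module"]       # becomes &module=...
labels["__param_auth"] = labels["auth"]           # becomes &auth=...
labels["__address__"] = "127.0.0.1:9116"          # actually scrape snmp_exporter

url = "http://{}/snmp?target={}&module={}&auth={}".format(
    labels["__address__"],
    labels["__param_target"],
    labels["__param_module"],
    labels["__param_auth"],
)
print(url)
# http://127.0.0.1:9116/snmp?target=gw.ws.nsrc.org&module=if_mib&auth=nsrc_v3
```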

There is a more detailed explanation in the prometheus documentation.