Network devices such as routers and switches usually can’t run prometheus exporters themselves. Fortunately, snmp_exporter can convert prometheus scrapes into SNMP queries.

Do this on your campus server instance (srv1.campusX.ws.nsrc.org)

1 Install snmp_exporter

(If snmp_exporter is pre-installed, skip to the next section “Start snmp_exporter”)

Fetch and unpack a release from the releases page (v0.19.0 is used below), then create a symlink so that /opt/snmp_exporter refers to the current version.

wget https://github.com/prometheus/snmp_exporter/releases/download/v0.19.0/snmp_exporter-0.19.0.linux-amd64.tar.gz
tar -C /opt -xvzf snmp_exporter-0.19.0.linux-amd64.tar.gz
ln -s snmp_exporter-0.19.0.linux-amd64 /opt/snmp_exporter
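
The symlink means that a later upgrade only has to unpack the new release next to the old one and re-point /opt/snmp_exporter. A minimal sketch of that layout, run in a scratch directory standing in for /opt; the newer release name snmp_exporter-X.Y.Z.linux-amd64 is a hypothetical placeholder, not a real version:

```shell
# Scratch directory standing in for /opt, so the real system is not touched.
OPT=$(mktemp -d)

# Initial install: a versioned directory plus a symlink pointing at it.
mkdir "$OPT/snmp_exporter-0.19.0.linux-amd64"
ln -s snmp_exporter-0.19.0.linux-amd64 "$OPT/snmp_exporter"
readlink "$OPT/snmp_exporter"

# Later: a newer release (hypothetical name) is unpacked alongside the old one.
NEW=snmp_exporter-X.Y.Z.linux-amd64
mkdir "$OPT/$NEW"

# -f replaces the existing link; -n stops ln from following the old link
# and creating the new link inside the old directory.
ln -sfn "$NEW" "$OPT/snmp_exporter"
readlink "$OPT/snmp_exporter"
```

Because the systemd unit refers only to /opt/snmp_exporter, it would not need editing after such a switch.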

Use a text editor to create a systemd unit file /etc/systemd/system/snmp_exporter.service with the following contents:

[Unit]
Description=Prometheus SNMP Exporter
Documentation=https://github.com/prometheus/snmp_exporter
After=network-online.target

[Service]
User=prometheus
Restart=on-failure
RestartSec=5
EnvironmentFile=/etc/default/snmp_exporter
ExecStart=/opt/snmp_exporter/snmp_exporter $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Tell systemd to read this new file:

systemctl daemon-reload

Also create an options file /etc/default/snmp_exporter with the following contents:

OPTIONS='--config.file=/etc/prometheus/snmp/snmp.yml --web.listen-address=127.0.0.1:9116'
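
systemd reads this file as KEY=value pairs and, because $OPTIONS appears unbraced as a separate word in ExecStart, splits its value on whitespace into separate arguments. The sketch below imitates that on a scratch copy of the file. It uses the shell's own word splitting as an approximation, not systemd's parser, and the variable names ENVFILE, ARGC, FIRST_ARG and SECOND_ARG are only for illustration:

```shell
# Scratch copy of the options file (stand-in for /etc/default/snmp_exporter).
ENVFILE=$(mktemp)
cat > "$ENVFILE" <<'EOF'
OPTIONS='--config.file=/etc/prometheus/snmp/snmp.yml --web.listen-address=127.0.0.1:9116'
EOF

# Load the file, then let the shell split the unquoted value on whitespace,
# roughly as systemd does for an unbraced $OPTIONS.
. "$ENVFILE"
set -- $OPTIONS
ARGC=$#
FIRST_ARG=$1
SECOND_ARG=$2

echo "$ARGC arguments:"
printf '  %s\n' "$@"
```

Each flag arrives at snmp_exporter as its own argument, which is why both can share one OPTIONS line.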

Create the initial default configuration:

mkdir /etc/prometheus/snmp
cp /opt/snmp_exporter/snmp.yml /etc/prometheus/snmp/

2 Start snmp_exporter

Let’s start snmp_exporter:

systemctl enable snmp_exporter  # start on future boots
systemctl start snmp_exporter   # start now
journalctl -eu snmp_exporter    # check for "Listening on address" address=127.0.0.1:9116

Use cursor keys to move around the journalctl log output, and “q” to quit. If there are any errors, then go back and fix them.

3 Configure community string

snmp_exporter’s configuration file is generated by a separate “generator” tool. Its inputs are the MIBs and a higher-level description of what to collect, and it also merges in the security credentials such as SNMPv2/v3 keys.

Unfortunately, the “generator” tool is currently not bundled and has to be built from source, making it inconvenient to re-use.

Instead, we will modify the existing file as a workaround.

Edit /etc/prometheus/snmp/snmp.yml

Search for the line which starts with if_mib:. This is around line 5,405. Don’t scroll by hand - use your editor’s search function!

Add &if_mib to the end of that line, so it looks like this:

if_mib: &if_mib

Now scroll down to the very end of the file (again - don’t scroll by hand, it’s over 17,000 lines), and add the following:

if_mib_v3:
  <<: *if_mib
  version: 3
  timeout: 3s
  retries: 3
  auth:
    security_level: authNoPriv
    username: admin
    password: NetManage
    auth_protocol: SHA

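This creates a new module “if_mib_v3”, which is a copy of “if_mib” with the security settings overridden for SNMPv3 using our credentials. The &if_mib anchor gives the existing module a name that can be referenced, and <<: *if_mib merges its contents into the new module before the extra settings are applied.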
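
The same two edits could also be scripted. The sketch below works on a small stand-in file rather than the real /etc/prometheus/snmp/snmp.yml; the three-line if_mib module in it is illustrative only, and the sed pattern assumes the real line contains nothing but if_mib: (check this before adapting it):

```shell
# Stand-in for /etc/prometheus/snmp/snmp.yml, cut down to a few lines.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
if_mib:
  walk:
  - 1.3.6.1.2.1.2
EOF

# Step 1: add the anchor to the line that starts the if_mib module.
# \& is a literal ampersand; a bare & in sed means "the matched text".
sed -i 's/^if_mib:[[:space:]]*$/if_mib: \&if_mib/' "$CONF"

# Step 2: append the SNMPv3 module, which merges in everything from if_mib.
cat >> "$CONF" <<'EOF'
if_mib_v3:
  <<: *if_mib
  version: 3
  timeout: 3s
  retries: 3
  auth:
    security_level: authNoPriv
    username: admin
    password: NetManage
    auth_protocol: SHA
EOF

# Show the two module header lines to confirm the result.
grep -n '^if_mib' "$CONF"
```

Take a backup copy of the real file before running anything like this against it.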

Save, and signal snmp_exporter to pick up the change:

killall -HUP snmp_exporter
journalctl -eu snmp_exporter     # check for errors

If there are any errors, fix them.

4 Manual scrape

Perform manual scrapes of two devices, using the following commands:

curl 'localhost:9116/snmp?module=if_mib_v3&target=gw.ws.nsrc.org'
curl 'localhost:9116/snmp?module=if_mib_v3&target=core1.campusX.ws.nsrc.org'

Note that in each case the scrape is sent to localhost (where snmp_exporter is running), with two parameters: module says which MIB and credentials to use, and target tells snmp_exporter where to send the SNMP query (either a resolvable DNS name or an IP address).

You should get a large number of metrics back in prometheus format, e.g.

# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744
...
# HELP sysUpTime The time (in hundredths of a second) since the network management portion of the system was last re-initialized. - 1.3.6.1.2.1.1.3
# TYPE sysUpTime gauge
sysUpTime 1.20071e+06

The comment shows the SNMP OID, but in each case it has been translated to a plain prometheus metric.
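
Because the output is plain text, ordinary shell tools can pick values out of it. A minimal sketch, using the sample lines above rather than a live scrape; the variable names SAMPLE, IN_OCTETS and SAMPLES are only for illustration:

```shell
# Sample exposition text, copied from the example output above.
SAMPLE='# HELP ifHCInOctets The total number of octets received on the interface, including framing characters - 1.3.6.1.2.1.31.1.1.1.6
# TYPE ifHCInOctets counter
ifHCInOctets{ifAlias="",ifDescr="Intel Corporation Ethernet Connection (2) I219-LM",ifIndex="2",ifName="eno1"} 448744
# HELP sysUpTime The time (in hundredths of a second) since the network management portion of the system was last re-initialized. - 1.3.6.1.2.1.1.3
# TYPE sysUpTime gauge
sysUpTime 1.20071e+06'

# Count the samples: every line that is not a "#" comment is one metric value.
SAMPLES=$(printf '%s\n' "$SAMPLE" | grep -vc '^#')

# Print the value (the last whitespace-separated field) of one metric.
IN_OCTETS=$(printf '%s\n' "$SAMPLE" | grep -v '^#' | awk '/^ifHCInOctets[{ ]/ { print $NF }')

echo "samples: $SAMPLES"
echo "ifHCInOctets: $IN_OCTETS"
```

Against a running exporter, the printf could be replaced by one of the curl commands above with -s added to suppress the progress output.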

5 Prometheus configuration

Now we are ready to move on to configuring prometheus.

First, create a targets file /etc/prometheus/targets.d/snmp.yml containing the following:

- labels:
    module: if_mib_v3
  targets:
    - gw.ws.nsrc.org
    - bdr1.campusX.ws.nsrc.org
    - core1.campusX.ws.nsrc.org

However, we have a slight problem: we don’t want prometheus to scrape these targets directly. We want it to scrape the snmp_exporter on localhost and pass the target and module as parameters in the URL. To do this, we need to use prometheus’ relabeling feature.

Edit /etc/prometheus/prometheus.yml and add the following to the bottom of the scrape_configs: section:

  - job_name: 'snmp'
    file_sd_configs:
      - files:
         - /etc/prometheus/targets.d/snmp.yml
    metrics_path: /snmp
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [module]
        target_label: __param_module
      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter

Again, be careful with spacing. The dash before job_name should align exactly with the dashes of earlier job_name entries.
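
A quick way to spot a misaligned entry is to count how many different indentation widths the job_name lines use. The sketch below does this on a small stand-in file; the 'prometheus' job in it is a hypothetical example of an earlier entry, not necessarily what your prometheus.yml contains:

```shell
# Stand-in for /etc/prometheus/prometheus.yml with two scrape jobs.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'snmp'
    metrics_path: /snmp
EOF

# For each "- job_name:" line, record the number of leading spaces,
# then print how many distinct widths were seen.
WIDTHS=$(awk '/^ *- job_name:/ { match($0, /^ */); seen[RLENGTH] = 1 }
              END { n = 0; for (w in seen) n++; print n }' "$CONF")

if [ "$WIDTHS" -eq 1 ]; then
  echo "job_name entries are aligned"
else
  echo "job_name entries use $WIDTHS different indentations"
fi
```

This checks only alignment. If promtool is installed, promtool check config /etc/prometheus/prometheus.yml validates the whole file.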

For details of how this works, see the end of this sheet.

Now get prometheus to pick up the changes:

killall -HUP prometheus
journalctl -eu prometheus    # CHECK FOR ERRORS!

5.1 Testing

Return to the prometheus web interface at http://oob.srv1.campusX.ws.nsrc.org/prometheus

Run the following queries:

up{job="snmp"}

scrape_samples_scraped{job="snmp"}

The query “up” will return 1 for all target devices, even if the SNMP query fails, because snmp_exporter itself is working. However, “scrape_samples_scraped” shows the number of values retrieved; if it is 0, there was a problem with SNMP.

If there is a problem, sometimes it is helpful to use tcpdump to see the scrape attempts between prometheus and snmp_exporter:

tcpdump -i lo -nnA -s0 tcp port 9116

If scraping is successful, then you can now browse some of the values using the Console tab, for example:

ifOperStatus    # this is a gauge (values 1,2 etc defined in the MIB)

ifHCInOctets    # this is a counter

Can you remember how to change this counter into a rate in bits-per-second, so that you can get a traffic graph? Refer to the node_exporter exercise if you need to.

5.2 Adding more nodes

Add the border and core routers for ONE other campus in your targets file. Don’t do them all in case the 15-second polling interval overwhelms our platform.

Remember that you don’t need to HUP prometheus after updating the targets file.

6 How relabeling works (reference only)

When prometheus reads a targets file, it puts each target into a hidden label called __address__, and by default uses __address__ as the endpoint to scrape. If no “instance” label has been set, __address__ is also copied to a label called “instance”. Finally, any label beginning with __ is removed from the result.

However, before scraping there is an optional relabeling phase, where a set of relabeling steps is applied in order. What we have done is:

      - source_labels: [__address__]
        target_label: instance

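This copies the __address__ label to the instance label, so we end up with a label like instance="gw.ws.nsrc.org". Without this step, instance would default to the rewritten __address__ (127.0.0.1:9116) for every device.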

      - source_labels: [__address__]
        target_label: __param_target

We also copy the __address__ label to __param_target; this is applied as a query parameter called “target” in the final URL.

      - source_labels: [module]
        target_label: __param_module

Similarly, we copy the label module (which was applied in the targets file to the group of targets) to __param_module, which becomes the “module” parameter in the URL.

      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter

Finally, we replace __address__ with “127.0.0.1:9116”, which means that the actual scrape is sent to the snmp_exporter running on the local host. We also set metrics_path to /snmp, instead of the default which is /metrics, because this is what snmp_exporter requires.
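
To make the sequence concrete, the sketch below imitates the four steps for one target using plain shell variables. It is an illustration of the effect only, not how prometheus implements relabeling, and the variable names are chosen just for this example:

```shell
# Starting point for one entry in the targets file.
address='gw.ws.nsrc.org'     # __address__, as read from the targets file
module='if_mib_v3'           # label applied to the group in the targets file

instance=$address            # step 1: __address__ -> instance
param_target=$address        # step 2: __address__ -> __param_target
param_module=$module         # step 3: module -> __param_module
address='127.0.0.1:9116'     # step 4: __address__ replaced by the exporter

# metrics_path comes from the job definition; __param_* become URL parameters.
metrics_path='/snmp'
SCRAPE_URL="http://${address}${metrics_path}?module=${param_module}&target=${param_target}"

echo "scrape URL:     $SCRAPE_URL"
echo "instance label: $instance"
```

The resulting URL has the same form as the manual curl scrapes earlier in this sheet. The module label does not begin with __, so it also remains attached to the stored metrics.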