There are a number of things you can do to secure your prometheus installation.
At the moment, node_exporter is open to everyone. Someone who kept hitting the scrape endpoint repeatedly could cause a denial-of-service. Furthermore, the traffic is not encrypted.
We could secure node_exporter by putting a reverse proxy in front of it, such as apache, nginx, or exporter_exporter.
However, node_exporter has recently gained built-in TLS support, so we will use that. Not only will the traffic be encrypted, but we will also use TLS certificates to authenticate.
To do this properly you would set up a certificate authority. However we’re going to use a very simple config, where all the nodes use one key and certificate, and the prometheus server uses a different key and certificate.
This allows you to push out the same key and certificate on all your nodes, but only the prometheus server is authorised to scrape them.
NOTE: this requires node_exporter 1.0.0 or later
Firstly, check that your node exporter is running as expected.
curl srv1.campusX.ws.nsrc.org:9100/metrics
(You can also try scraping another campus’ node_exporter, to prove that it is insecure!)
Next, we will create a key and certificate for node_exporter to use:
mkdir /etc/prometheus/ssl
cd /etc/prometheus/ssl
openssl req -x509 -newkey rsa:1024 -keyout prom_node_key.pem -out prom_node_cert.pem -days 29220 -nodes -subj /commonName=prom_node/ -addext "subjectAltName=DNS:prom_node"
Type ls and you should see two files: prom_node_cert.pem and prom_node_key.pem. This is how the node_exporter identifies itself to prometheus.
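If you want to confirm what went into the certificate (for example that the subjectAltName was set correctly), openssl can decode it for you. This assumes OpenSSL 1.1.1 or later, which you already need for the -addext option used above:

```shell
# Decode the new certificate: show the subject, validity dates and SAN
openssl x509 -in /etc/prometheus/ssl/prom_node_cert.pem \
    -noout -subject -dates -ext subjectAltName
```

You should see a subject containing CN = prom_node and a Subject Alternative Name of DNS:prom_node.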
Next, create a file /etc/prometheus/node_tls.yml with the following contents:
tlsConfig:
  tlsCertPath: /etc/prometheus/ssl/prom_node_cert.pem
  tlsKeyPath: /etc/prometheus/ssl/prom_node_key.pem
Next, edit /etc/default/node_exporter and add the option --web.config=/etc/prometheus/node_tls.yml. For example, if it currently reads:
OPTIONS='--collector.textfile.directory=/var/lib/node_exporter'
then it becomes:
OPTIONS='--collector.textfile.directory=/var/lib/node_exporter --web.config=/etc/prometheus/node_tls.yml'
Restart it and check for errors:
systemctl restart node_exporter
journalctl -eu node_exporter
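A quick way to confirm the new config was picked up is that a plain-HTTP scrape, which worked at the start of this exercise, should no longer succeed:

```shell
# This should now fail, or return an error saying the client sent an
# HTTP request to an HTTPS server: port 9100 now only speaks TLS
curl http://localhost:9100/metrics
```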
Now we can do a test scrape using curl and https:
curl --cacert /etc/prometheus/ssl/prom_node_cert.pem --resolve prom_node:9100:127.0.0.1 -v https://prom_node:9100/metrics
The scrape should succeed, this time over https. We used the fake hostname “prom_node” to match the certificate, told curl to use address 127.0.0.1 for this hostname, and verified the server against the certificate in prom_node_cert.pem.
So far, we’ve encrypted the scrape with TLS, but anyone is still allowed to perform it. So now we need to make a new key and certificate for the prometheus server to use when scraping, and configure node_exporter so that it only accepts scrapes from a client which holds this key.
Create the new key and cert for prometheus:
cd /etc/prometheus/ssl
openssl req -x509 -newkey rsa:1024 -keyout prometheus_key.pem -out prometheus_cert.pem -days 29220 -nodes -subj /commonName=prometheus/ -addext "subjectAltName=DNS:prometheus"
Edit /etc/prometheus/node_tls.yml so it looks like this:
tlsConfig:
  tlsCertPath: /etc/prometheus/ssl/prom_node_cert.pem
  tlsKeyPath: /etc/prometheus/ssl/prom_node_key.pem
  clientAuth: RequireAndVerifyClientCert
  clientCAs: /etc/prometheus/ssl/prometheus_cert.pem
Restart node_exporter:
systemctl restart node_exporter
journalctl -eu node_exporter
Now re-run the exact same curl command as you did before:
curl --cacert /etc/prometheus/ssl/prom_node_cert.pem --resolve prom_node:9100:127.0.0.1 -v https://prom_node:9100/metrics
You should see an error:
curl: (35) gnutls_handshake() failed: Certificate is bad
This is because the client isn’t presenting a certificate to the server to identify itself.
We now need a longer curl command (split across lines for clarity):
curl --cert /etc/prometheus/ssl/prometheus_cert.pem \
--key /etc/prometheus/ssl/prometheus_key.pem \
--cacert /etc/prometheus/ssl/prom_node_cert.pem \
--resolve prom_node:9100:127.0.0.1 \
-v https://prom_node:9100/metrics
This should now work. We’ve proved our identity to node_exporter using our private key, and it will now talk to us.
If you don’t understand what’s going on here, please talk to your instructors!
At this point, prometheus should be failing to scrape our node. You can check like this:
/opt/prometheus/promtool query instant http://localhost:9090/prometheus up
Look for a line in the results like this:
up{instance="srv1.campusX.ws.nsrc.org:9100", job="node"} => 0 @[1582210727.15]
The “=> 0” is the scrape result (0=fail, 1=success).
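You can also ask for the same metric through the HTTP API directly; the /prometheus URL prefix here matches the promtool command above:

```shell
# Query the current value of "up" via the prometheus HTTP API
curl -s 'http://localhost:9090/prometheus/api/v1/query?query=up'
```

The JSON response contains the same instance labels and 0/1 values as the promtool output.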
We now have to update prometheus to scrape using TLS, in the same way as we have been doing with curl.
Edit /etc/prometheus/prometheus.yml and find the section which starts job_name: 'node'. Edit it so it looks like this:
  - job_name: 'node'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets.d/node.yml
    scheme: https
    tls_config:
      # Verifying remote identity
      ca_file: /etc/prometheus/ssl/prom_node_cert.pem
      server_name: prom_node
      # Asserting our identity
      cert_file: /etc/prometheus/ssl/prometheus_cert.pem
      key_file: /etc/prometheus/ssl/prometheus_key.pem
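Before reloading, it is worth checking that the edited file still parses cleanly, using the same promtool binary as before:

```shell
# Validate prometheus.yml before asking prometheus to reload it
/opt/prometheus/promtool check config /etc/prometheus/prometheus.yml
```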
Signal prometheus to re-read its configuration:
killall -HUP prometheus
journalctl -eu prometheus
Re-run the “promtool query” command to check the “up” metric. Within a minute, the result should change from 0 to 1. We are successfully scraping over TLS, with authentication!
NOTE: if you are scraping other campuses’ servers, those scrapes will still FAIL. This is because the other campuses are using a different key and certificate for their prometheus server, and they only trust their own. This is exactly how it’s meant to work: each campus has now locked out access from the other campuses.
Of course, prometheus isn’t normally just scraping itself; it’s scraping remote nodes. To deploy this change to remote nodes, you would copy the following files to them:
- /etc/default/node_exporter
- /etc/prometheus/node_tls.yml
- /etc/prometheus/ssl/prom_node_cert.pem
- /etc/prometheus/ssl/prom_node_key.pem
- /etc/prometheus/ssl/prometheus_cert.pem
but NOT prometheus_key.pem. That file is private to the prometheus server only; it’s ownership of this key which proves the prometheus server’s identity.
(Also: the prometheus server itself doesn’t need prom_node_key.pem, but you’ll need it if node_exporter is running on the same server)
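As a sketch, the copying could be scripted like this; the hostname is a placeholder for your own nodes, and it assumes you have root SSH access to them:

```shell
# Example only: push the shared TLS material to each node and restart
# node_exporter there. Replace hostX.campusX.ws.nsrc.org with your nodes.
for host in hostX.campusX.ws.nsrc.org; do
  ssh "$host" mkdir -p /etc/prometheus/ssl
  scp /etc/default/node_exporter "$host":/etc/default/node_exporter
  scp /etc/prometheus/node_tls.yml "$host":/etc/prometheus/node_tls.yml
  scp /etc/prometheus/ssl/prom_node_cert.pem \
      /etc/prometheus/ssl/prom_node_key.pem \
      /etc/prometheus/ssl/prometheus_cert.pem \
      "$host":/etc/prometheus/ssl/
  ssh "$host" systemctl restart node_exporter
done
```

Note that prometheus_key.pem is deliberately not in the list.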
If you have deployed node_exporter to your hostX virtual machines, you can update them to use TLS now.
The standard way to secure the prometheus and alertmanager web interfaces is to put them behind a HTTPS reverse proxy (e.g. apache or nginx) which terminates TLS and requires clients to authenticate.
Grafana has its own authentication, so only TLS is required.
Doing this configuration is outside the scope of this exercise. However, there are a couple of things which you should remember:
- Make prometheus and alertmanager listen only on localhost, so that they cannot be reached except via the proxy: --web.listen-address=127.0.0.1:9090 (or 9093 for alertmanager)
- One option is to have a separate virtual host for each (e.g. prometheus.example.net, alertmanager.example.net) which both point to the IP address of your reverse proxy. You’ll need to generate a TLS certificate with all the names in it.
- Another option is to use URL path prefixes like /prometheus and /alertmanager. If you do this, you’ll need to configure more options so that all generated URLs have the correct prefixes:
--web.external-url=https://noc.example.net/prometheus
This is what we’ve chosen to do in this workshop.
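As a rough sketch only (not a tested configuration), an nginx server block implementing the path-prefix approach might look like this; the hostname and certificate paths are examples:

```nginx
# Hypothetical sketch: terminate TLS and proxy the /prometheus and
# /alertmanager prefixes to the locally-bound services
server {
    listen 443 ssl;
    server_name noc.example.net;

    ssl_certificate     /etc/ssl/certs/noc.example.net.crt;    # your cert
    ssl_certificate_key /etc/ssl/private/noc.example.net.key;  # your key

    location /prometheus {
        proxy_pass http://127.0.0.1:9090;
    }
    location /alertmanager {
        proxy_pass http://127.0.0.1:9093;
    }
}
```

You would still need to add authentication (e.g. Basic Auth) for these locations.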