In this exercise, you’re going to use some simple PromQL queries.
Try out these queries in the Prometheus web interface at http://oob.srv1.campusX.ws.nsrc.org/prometheus
Don’t worry if you don’t have time to do them all - you can do the rest at your leisure. Ask your instructor if you’d like a particular query to be explained.
Remember that each timeseries in your database has the following form:
metricname{label_name="value", ...} value
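For example, a node exporter timeseries might look something like this (the label values here are invented, just for illustration):
node_filesystem_free_bytes{device="/dev/sda1",fstype="ext4",instance="srv1.campusX.ws.nsrc.org",job="node",mountpoint="/"} 12345678901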
The most important part of PromQL querying is to filter down, from the universe of timeseries which are available, to just the ones you are interested in.
The simplest query is to give just the metric name. Try this query:
up
and click “Execute”.
This gives you every timeseries whose metric name is “up” - regardless of what labels it has.
Try a typo:
upz
You don’t get an error - you just get an empty result set. “upz” is a valid metric name, and it just happens that your database doesn’t contain any timeseries with this metric name.
When you are in the GUI, there are two different tabs: “Console” and “Graph”. Start with the “Console” view; you should see something like this for your “up” query:
up{instance="bdr1.campusX.ws.nsrc.org",job="snmp",module="if_mib_v3"} 1
up{instance="core1.campusX.ws.nsrc.org",job="snmp",module="if_mib_v3"} 0
up{instance="localhost:9090",job="prometheus"} 1
up{instance="srv1.campusX.ws.nsrc.org",job="node"} 1
This has a special name: it’s called an instant vector. It’s a “vector” because there are multiple values in a column, and these values all apply to the same instant in time - the current time.
When you ask Prometheus for the value of a metric at a particular instant in time, it gives you the most recent scrape value which occurred before that time.
Now switch to the “Graph” view. You will see the values of those metrics at different times over a time period. Prometheus does this by sweeping across a range of timestamps and drawing a point for the instant vector at each one. Effectively it’s repeating the same query for different points of time in the past.
You can use the “+” and “-” buttons to change the time range covered.
Try the following queries:
up{job="snmp"}
up{instance="localhost:9090"}
These queries filter the timeseries down further: the first to those where the job label is “snmp”, and the second to those where the instance label is “localhost:9090”.
You can also filter for timeseries where a label doesn’t have a given value:
up{job!="prometheus"}
Missing labels are treated as empty labels. So
up{wombat=""}
will match all timeseries which don’t have a label “wombat” (which should be everything), and
up{wombat!=""}
will give only those timeseries which do have a label “wombat” (which hopefully will show zero timeseries!)
Sometimes you want to match multiple values of a label. This can be done by using regular expression matching, which is a sort of pattern matching.
Try the following:
up{job=~"snmp|node"}
The vertical bar has a special meaning as “OR” in a regular expression: “match snmp OR node”.
Now try this, substituting campusX with your own campus:
up{instance=~".*campusX.*"}
This should show you all instances of up which contain “campusX” anywhere inside the instance label. How does this work?
Dot has a special meaning which is “match any single character”. And star means “match the previous item zero or more times”.
So .*campusX.* matches any instance label which contains “campusX” anywhere inside it, with any characters (or none) before and after.
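Note that label regex matches in PromQL have to match the whole label value (they are anchored at both ends), which is why the leading and trailing .* are needed. As a quick check, the following query should return nothing, because it only matches an instance label that is exactly “campusX”:
up{instance=~"campusX"}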
There’s another way to filter down to a single metric. The metric name is really just a hidden, internal label called __name__. So the following should give you the same results as the original “up” query:
{__name__="up"}
This is an unusual query to do, but it demonstrates the point, and it is occasionally useful: for example, using regular expression matching you can query multiple metrics at once. Here you can ask Prometheus for all the metrics which start with node_filesystem_:
{__name__=~"node_filesystem_.*"}
If you are not careful, this form of query can be dangerous: it could match a large number of timeseries at once, making it very slow to execute and potentially exhausting resources on your server. In any case, different metrics have different meanings, so it would be unusual to mix different metrics together in the same query results.
Switch back to Console view, and run this query:
node_filesystem_free_bytes
It gives the amount of free space in each filesystem (it is conventional to include the units as part of the metric name, here _bytes).
This is fine, but it’s not easy to read. What if you want it in gigabytes? That’s easy - PromQL has arithmetic operators:
node_filesystem_free_bytes / 1000000000
A simple number like this, without any timeseries or labels, is called a “scalar”.
When you combine an instant vector with a scalar, the operation is performed on each element of the instant vector and gives you a new instant vector with the same labels. So you’ll have the same number of results as before, but with different values.
What if you want to know what fraction of each filesystem is free?
There’s another metric giving the total size of each filesystem. Execute this query:
node_filesystem_size_bytes
You should see the total size of each filesystem, with the same labels as before. Now you can do this:
node_filesystem_free_bytes / node_filesystem_size_bytes
You should get values between 0 and 1. If you find there are no results, it probably means you mistyped one of the metric names.
Can you modify this query so that it shows a percentage, rather than a value between 0 and 1?
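If you get stuck: one possible answer (just a sketch) is to multiply the ratio by 100:
node_filesystem_free_bytes / node_filesystem_size_bytes * 100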
NOTE: operations between instant vectors like this only work where the labels on the left-hand and right-hand sides match exactly, one to one. The resulting instant vector has the same set of labels.
If even one label is different, there’s no match and no result in the output. That’s why if you mistype even one metric name, you get no results: one side is empty, and there’s nothing to match to the other side.
If the label sets are almost the same, then there’s a way to ignore some of the labels when matching. See the documentation for more details.
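The on() and ignoring() matching modifiers are what do this. As a sketch of the syntax only - the label chosen here is arbitrary, since both sides of this particular query already match exactly:
node_filesystem_free_bytes / ignoring(fstype) node_filesystem_size_bytes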
Prometheus has a range of aggregation operators and functions which operate across the timeseries in an instant vector.
These ones sort by value, ascending or descending:
sort (node_filesystem_free_bytes)
sort_desc (node_filesystem_free_bytes)
topk (3, node_filesystem_free_bytes)
The last one sorts by value and gives only the top 3 values. It can be used to answer questions such as “which are my largest filesystems?” or “which filesystems are closest to full?”
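For “closest to full”, one sketch is to combine this with the free-space ratio from earlier, using the related bottomk operator (which keeps the smallest values instead of the largest):
bottomk(3, node_filesystem_free_bytes / node_filesystem_size_bytes)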
Try this:
max (node_filesystem_size_bytes)
Note that it gives an instant vector with a single value and an empty label set {}. It’s the size of the largest filesystem you have, across every filesystem being monitored.
These aggregations can be grouped on a given label. Try this:
max by (instance) (node_filesystem_size_bytes)
This gives the maximum filesystem size separately for each instance. So if you’re polling 5 servers, each of which has multiple filesystems, you’ll get 5 timeseries in the result, each labeled with the instance name, and containing the maximum filesystem size across the filesystems on that server only.
Sometimes, you’re only interested in the number of timeseries, rather than their values. How many filesystems do I have on each node?
count by (instance) (node_filesystem_size_bytes)
Here it doesn’t matter which metric we query, as long as it has one timeseries per filesystem. The result is simply the number of timeseries.
You can add up the values across timeseries using “sum”. Dividing by the number of timeseries you get the average:
sum (node_filesystem_size_bytes) / count (node_filesystem_size_bytes)
There is a built-in function to simplify this:
avg (node_filesystem_size_bytes)
Again: this gives the average across all the filesystems you are monitoring. If you want to see the average filesystem size per host then you can group the results based on the “instance” label:
avg by (instance) (node_filesystem_size_bytes)
Especially for alerting, you want to be able to filter on the value of a timeseries rather than its labels.
Try this:
node_filesystem_avail_bytes < 100000000
It will return an instant vector containing all filesystems whose available space is less than 100MB.
This is perfect for alerting. If the instant vector is empty, then there is no alert to send. But if there are multiple results in the instant vector, we can send out a single alert grouping together all the affected filesystems.
NOTE: the result of this query is NOT a boolean like “true” or “false”; it’s still the value of the metric. We’ve just filtered out all the timeseries which don’t meet the condition. However you can get a boolean result (0 or 1) by adding a “bool” modifier:
node_filesystem_avail_bytes < bool 100000000
Since the filtering of an instant vector just returns another instant vector, you can continue to filter it further. e.g. to get those timeseries whose value is between two values:
node_filesystem_avail_bytes > 100000000 < 200000000
# this is equivalent to:
(node_filesystem_avail_bytes > 100000000) < 200000000
There are further ways to select timeseries depending on the presence of other timeseries. For example: suppose node_filesystem_avail_bytes is 0, but because it’s a pseudo-filesystem, node_filesystem_size_bytes is 0 as well. Alerting on this would be a false alarm: it’s a normal condition.
You can combine the conditions like this:
node_filesystem_avail_bytes < 100000000 and node_filesystem_size_bytes > 0
Due to operator precedence, this is interpreted as
(node_filesystem_avail_bytes < 100000000) and (node_filesystem_size_bytes > 0)
The “and” operator says: give the left-hand result only if there is a matching timeseries (i.e. with the same set of labels) on the right-hand side. The actual value of the right-hand timeseries is ignored.
You can also invert the logic:
node_filesystem_avail_bytes < 100000000 unless node_filesystem_size_bytes == 0
Now the LHS timeseries will only be allowed through if there is not a matching timeseries on the RHS.
Remember: all these things are working on instant vectors, so a single expression can be working across many timeseries at once, building an instant vector of all the results.
Sometimes you are interested in performing queries over time. How does this value compare to how it was 10 minutes ago? How fast is it going up or down?
You may have seen these briefly in a previous exercise. This is a partial query which returns a range vector:
node_network_transmit_bytes_total[5m]
NOTE: you cannot graph this query in the Prometheus web interface, because the graph view can only display instant vectors.
What it means is: “return all the data points for the timeseries with the node_network_transmit_bytes_total metric over the last 5 minutes”. It is therefore two-dimensional: there are multiple timeseries, each with data points over time.
Here "*" represents a data point at a given time:
{label="X"} .*......*.....*.....*.....*.... ^
{label="Y"} ....*....*.....*.....*....*.... timeseries
{label="Z"} ...*.....*.....*.....*.....*... v
<------------ time ----------->
| |
5 mins ago now
(Note that data points aren’t necessarily collected at identical points in time, due to variations in scraping)
How can we use this, if we can’t graph it directly? We can perform aggregations over time. This query should work:
count_over_time(node_network_transmit_bytes_total[5m])
This gives the number of data points inside the given time window. What value do you see? Is it what you expect?
Hint: we are sampling data points at 1 minute intervals. So how many data points fit within a 5 minute window?
There are other aggregation over time functions, such as max/min/avg. For example, for a gauge like scrape_duration_seconds we can get the largest value over the last 5 minutes:
max_over_time(scrape_duration_seconds[5m])
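The other functions follow the same pattern; for example, the smallest and the average value over the last 5 minutes:
min_over_time(scrape_duration_seconds[5m])
avg_over_time(scrape_duration_seconds[5m])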
But where it becomes interesting is using the “rate” and “irate” functions to convert counters into rates. Try these functions in the “graph” view:
rate(node_network_transmit_bytes_total[5m])
irate(node_network_transmit_bytes_total[5m])
Does one look “spikier” than the other?
The difference is how the rate is calculated. “rate” takes the difference between the first and the last data points in the selected time window, and therefore smooths over a longer time period:
{label="Z"} ...*.....*.....*.....*.....*... v
<--------- rate -------->
Whereas “irate” (instantaneous rate) takes the difference between the last two data points in the selected time window:
{label="Z"} ...*.....*.....*.....*.....*... v
<----->
irate
Given this knowledge, can you explain the results you get from the following queries? Look at them in the “graph” view again.
rate(node_network_transmit_bytes_total[1m])
rate(node_network_transmit_bytes_total[70s])
rate(node_network_transmit_bytes_total[90s])
Hint: remember that you need two data points in order to calculate a rate.
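Since rate() itself returns an instant vector, you can also feed it into the aggregation operators from earlier in this exercise. For example, a sketch of the total transmit rate per host, summed across its network interfaces:
sum by (instance) (rate(node_network_transmit_bytes_total[5m]))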
You can use the command line to access the Prometheus API in the same way as the GUI. The simple instant vector query “up” is done like this:
/opt/prometheus/promtool query instant http://localhost:9090/prometheus up
You will get one result per timeseries, giving the current (most recent) value of the metric. To sweep this over a range:
/opt/prometheus/promtool query range http://localhost:9090/prometheus up --step=30s
This gives you the values of the metric over the last 5 minutes, sampled at 30 second intervals. (You can supply --start and --end options to specify absolute start and end times.)
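You can run more complex expressions the same way; just quote the query so the shell doesn’t interpret its special characters. For example (a sketch, reusing a query from earlier):
/opt/prometheus/promtool query instant http://localhost:9090/prometheus 'rate(node_network_transmit_bytes_total[5m])'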