% Monitoring disk stats with Cacti

# Disk space utilisation

You need to be aware of two MIBs which can be used for monitoring disk
space.

## hrStorageTable (.1.3.6.1.2.1.25.2.3)

This comes from HOST-RESOURCES-MIB and is the "standard" way of returning
disk space utilisation.  It is supported by many vendors, including the
Windows SNMP agent, and is enabled by default in net-snmp on Linux as well.

It is limited to returning 2^31 filesystem blocks, which means you can have
problems when monitoring large filesystems (e.g. an 8TB filesystem with 4KB
blocks).
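
The limit can be worked out with shell arithmetic. This is only a sketch: the function name `max_fs_bytes` is made up for illustration, and it assumes a shell with 64-bit integer arithmetic (as bash and dash have on current Linux systems).

```shell
# Largest filesystem size (in bytes) that a 2^31 block count can describe.
# Usage: max_fs_bytes BLOCK_SIZE_IN_BYTES   (hypothetical helper name)
max_fs_bytes() {
    echo $(( (1 << 31) * $1 ))
}

max_fs_bytes 4096    # 4KB blocks: 2^31 * 4096 = 2^43 bytes, i.e. 8TB
max_fs_bytes 512     # 512-byte blocks: the limit drops to 1TB
```

So the smaller the block size reported by the agent, the sooner you hit the limit.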

Use snmpwalk to look at it. You will see entries for various things
including RAM utilisation, but the mounted partition(s) will be included
as members of type hrStorageFixedDisk.

~~~~
$ snmpwalk -v2c -c NetManage localhost hrStorageTable
...
HOST-RESOURCES-MIB::hrStorageIndex.31 = INTEGER: 31
...
HOST-RESOURCES-MIB::hrStorageType.31 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
...
HOST-RESOURCES-MIB::hrStorageDescr.31 = STRING: /
...
HOST-RESOURCES-MIB::hrStorageAllocationUnits.31 = INTEGER: 4096 Bytes
...
HOST-RESOURCES-MIB::hrStorageSize.31 = INTEGER: 4934317
...
HOST-RESOURCES-MIB::hrStorageUsed.31 = INTEGER: 232406
~~~~
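
The size and used values are counted in allocation units, not bytes, so you need to multiply them out yourself. A minimal sketch using the values shown above for index 31; the function name `storage_usage` is made up for illustration.

```shell
# Convert hrStorageTable values into used/total bytes and a percentage.
# Usage: storage_usage ALLOCATION_UNITS SIZE_BLOCKS USED_BLOCKS
storage_usage() {
    awk -v unit="$1" -v size="$2" -v used="$3" 'BEGIN {
        printf "%.0f of %.0f bytes used (%.1f%%)\n",
               used * unit, size * unit, 100 * used / size
    }'
}

# Values for index 31 (the / filesystem) in the output above
storage_usage 4096 4934317 232406
# prints: 951934976 of 20210962432 bytes used (4.7%)
```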

## dskTable (.1.3.6.1.4.1.2021.9)

This table is part of UCD-SNMP-MIB, which is specific to the net-snmp daemon
that runs under Linux/Unix.  It is a convenient alternative when monitoring
a Linux/Unix box if there are problems with hrStorageTable.  It can also
have problems with very large filesystems, except in very recent net-snmp
versions.

If you want to enable this MIB, you need to list each filesystem mount
point you want to monitor in `snmpd.conf`:

~~~~
$ sudo editor /etc/snmp/snmpd.conf
... add this line somewhere ...
disk /

$ sudo service snmpd restart
$ snmpwalk -v2c -c NetManage localhost dskTable
UCD-SNMP-MIB::dskIndex.1 = 1
UCD-SNMP-MIB::dskPath.1 = /
... etc
~~~~
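
If you have several mount points to add, a small helper can append the `disk` lines without creating duplicates. This is a sketch only: `add_disk_directive` is a made-up name, `/var` and `/home` are example mount points, and the check only recognises an exactly matching line (an existing `disk /var 10%` would not be detected). It is shown here against a scratch file rather than the real `snmpd.conf`.

```shell
# Append "disk MOUNTPOINT" to a config file unless that exact line exists.
# Usage: add_disk_directive CONFIG_FILE MOUNTPOINT   (hypothetical helper)
add_disk_directive() {
    grep -qxF "disk $2" "$1" 2>/dev/null || echo "disk $2" >> "$1"
}

# Try it on a scratch file first; this path is just an example
conf="${TMPDIR:-/tmp}/snmpd.conf.test"
: > "$conf"
for mp in / /var /home; do
    add_disk_directive "$conf" "$mp"
done
add_disk_directive "$conf" /var    # second call for /var is a no-op
cat "$conf"
```

Once you are happy with the result, the same lines can be added to `/etc/snmp/snmpd.conf`, followed by a restart of snmpd as shown above.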


# Cacti configuration

To make things complicated, Cacti has several different data queries
pre-defined.

* "SNMP - Get Mounted Partitions" fetches *hrStorageTable* using a PHP
  script `/usr/share/cacti/site/scripts/ss_host_disk.php`
* "ucd/net - Get Monitored Partitions" fetches *dskTable* using a direct
  SNMP query
* "Unix - Get Mounted Partitions" should be ignored. It doesn't use SNMP
  at all, but looks directly at the local host.

You may get one or other of these data queries automatically added for you
based on which Host Template you chose when creating a host. You can always
add another query if you wish, and/or remove the one you don't want.

## How to collect hrStorageTable from a device

* From the left panel select "Devices"
* If you are creating a new device, enter the name/hostname and SNMP
  settings as usual, and click "Create"
* Under "Associated Data Queries", next to "Add Data Query" select
  "SNMP - Get Mounted Partitions", and click Add.
* Click on "verbose query" next to the query you just added, to test the
  execution of the SNMP query.
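
It can also help to check from the Cacti server's command line that the device returns any rows at all for this table. The sketch below counts rows in `snmpwalk` output; `count_storage_rows` is a made-up name, and the community string and target are the ones used earlier in this document. A count of zero suggests the problem lies with SNMP on the device (community, version, firewall, agent configuration) rather than with Cacti.

```shell
# Count hrStorageTable rows in `snmpwalk ... hrStorageIndex` output on stdin.
count_storage_rows() {
    grep -c 'hrStorageIndex\.'
}

# Against a real device, run on the Cacti server:
#   snmpwalk -v2c -c NetManage localhost hrStorageIndex | count_storage_rows
# A canned line from the earlier output stands in for the device here:
printf '%s\n' 'HOST-RESOURCES-MIB::hrStorageIndex.31 = INTEGER: 31' \
    | count_storage_rows
```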

If you see "Success [0 Items, 0 Rows]" then there is a problem. Find it and
fix it, and click on "verbose query" again until you see something like
"Success [24 Items, 8 Rows]".

## Create graphs

Now go to "Create Graphs for this Host". Under "Data Query [SNMP - 
Get Mounted Partitions]", check the box(es) next to the partitions you want
to graph, and then click Create at the bottom of the page.

If this is a new host, add it into a graph tree.


# Monitoring disk I/O operations

You can use Cacti to monitor disk I/O (that is, read/write transactions per
second and bytes per second).  The data comes from UCD-DISKIO-MIB, which is
available in recent versions of Linux snmpd.

However, out-of-the-box Cacti does not have the data query for this, so you
need to install a new data query and graph templates.  This can be done on a
standard Cacti installation - it does *not* require the Cacti Plugin
Architecture.

## Download the configuration

Firstly, go to http://docs.cacti.net/usertemplate:data:host_mib:diskio and
download the file with a name like 'diskio087d.tar.gz', and extract the two
XML files it contains.  You could do this under Linux like this:

    $ wget http://docs.cacti.net/_media/usertemplate:data:host_mib:diskio087d.tar.gz
    $ gzip -dc usertemplate:data:host_mib:diskio087d.tar.gz | tar -xvf -

This should give you two files:

* `disk_io.xml`
* `cacti087d_data_query_snmp_-_get_disk_io.xml`

## Install the configuration files

`disk_io.xml` needs to be installed on the Linux box running Cacti, in the
correct directory:

    $ sudo cp disk_io.xml /usr/share/cacti/resource/snmp_queries/disk_io.xml

The other file needs to be installed via the web interface. Log in to Cacti
via the web browser and click "Import Templates". Next to "Import Template
from Local File" click "Choose", select the file, then click "Save".

Note: this means that you'll have to have the file
`cacti087d_data_query_snmp_-_get_disk_io.xml` on your PC or laptop.  You can
copy it to your laptop using something like Putty PSFTP (or another Windows
scp or sftp client).  Alternatively, just download the original .tar.gz file
to your laptop and unpack it again there.

You should see "Cacti has imported the following items..."

Go to Data Queries and you should see "SNMP - Get Disk IO" as a new entry.

## Switch to 64-bit counters

Unfortunately, using 32-bit counters for disk I/O means that with 5-minute
polling, your graphs will break if you exceed about 14MB/sec of disk
activity, because the counter can then wrap more than once between polls.
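
The 14MB/sec figure follows from the counter size and the polling interval. A small sketch of the arithmetic; the function name `max_rate` is made up, and a shell with 64-bit integer arithmetic is assumed.

```shell
# Highest sustained rate (bytes/sec) a counter of the given width can record
# before it wraps more than once between polls.
# Usage: max_rate COUNTER_BITS POLL_INTERVAL_SECONDS   (hypothetical helper)
max_rate() {
    echo $(( (1 << $1) / $2 ))
}

max_rate 32 300    # 5-minute polling: 14316557 bytes/sec, about 14MB/sec
max_rate 32 60     # 1-minute polling: 71582788 bytes/sec, about 71MB/sec
```

Polling more often raises the ceiling, but switching to 64-bit counters removes the problem in practice.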

Modern versions of net-snmp support 64-bit counters. You can check this by
doing:

~~~
snmpwalk -v2c -c <community> <target> UCD-DISKIO-MIB::diskIOTable
~~~

If you see `diskIONReadX` and `diskIONWrittenX` (note the final "X") then
you have extended or 64-bit counters.
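
If the walk produces a lot of output, you can let the shell do the looking. A minimal sketch: `has_64bit_diskio` is a made-up name, and the sample lines below use invented counter values so that the example runs without a device.

```shell
# Read snmpwalk output of diskIOTable on stdin and report whether both
# extended (64-bit) byte counters are present.
has_64bit_diskio() {
    awk '
        /diskIONReadX\./    { r = 1 }
        /diskIONWrittenX\./ { w = 1 }
        END {
            if (r && w) print "64-bit counters available"
            else        print "32-bit counters only"
        }'
}

# Against a real device (community and target are placeholders, as above):
#   snmpwalk -v2c -c <community> <target> UCD-DISKIO-MIB::diskIOTable | has_64bit_diskio
# Canned sample lines with made-up values:
printf '%s\n' \
    'UCD-DISKIO-MIB::diskIONReadX.1 = Counter64: 1024' \
    'UCD-DISKIO-MIB::diskIONWrittenX.1 = Counter64: 2048' \
    | has_64bit_diskio
```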

To use them, edit `/usr/share/cacti/resource/snmp_queries/disk_io.xml` and
change the OIDs as follows:

~~~
...
<hrDiskIONRead>
        <name>Number of Bytes Read</name>
        <method>walk</method>
        <source>value</source>
        <direction>output</direction>
        <oid>.1.3.6.1.4.1.2021.13.15.1.1.3</oid>   <!-- change final .3 to .12 -->
</hrDiskIONRead>
<hrDiskIONWrite>
        <name>Number of Bytes Written</name>
        <method>walk</method>
        <source>value</source>
        <direction>output</direction>
        <oid>.1.3.6.1.4.1.2021.13.15.1.1.4</oid>   <!-- change final .4 to .13 -->
</hrDiskIONWrite>
...
~~~
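
If you would rather not edit the file by hand, the same change can be made with `sed`. This is a sketch under the assumption that the two `<oid>` lines in your copy of `disk_io.xml` look exactly like the ones above; `use_64bit_oids` is a made-up name. It works as a filter so you can try it safely before touching the real file.

```shell
# Rewrite the two byte-counter OIDs in disk_io.xml content read from stdin:
#   ...13.15.1.1.3 (diskIONRead)    -> ...13.15.1.1.12 (diskIONReadX)
#   ...13.15.1.1.4 (diskIONWritten) -> ...13.15.1.1.13 (diskIONWrittenX)
use_64bit_oids() {
    sed -e 's|<oid>\.1\.3\.6\.1\.4\.1\.2021\.13\.15\.1\.1\.3</oid>|<oid>.1.3.6.1.4.1.2021.13.15.1.1.12</oid>|' \
        -e 's|<oid>\.1\.3\.6\.1\.4\.1\.2021\.13\.15\.1\.1\.4</oid>|<oid>.1.3.6.1.4.1.2021.13.15.1.1.13</oid>|'
}

# To apply it for real, keep a backup and write the result back, e.g.:
#   f=/usr/share/cacti/resource/snmp_queries/disk_io.xml
#   sudo cp "$f" "$f.orig" && use_64bit_oids < "$f.orig" | sudo tee "$f" > /dev/null
# Quick check on a single line:
echo '<oid>.1.3.6.1.4.1.2021.13.15.1.1.3</oid>' | use_64bit_oids
```

Because each pattern includes the closing `</oid>` tag, other OIDs in the file that merely start with the same prefix are left alone.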

## Start monitoring

* From the left pane select "Devices", then from the main screen click on
  a device
* Under "Associated Data Queries", next to "Add Data Query" select
  "SNMP - Get Disk IO" and click "Add"
* Go to "Create Graphs for this Host"
* Under "Data Query [SNMP - Get Disk IO]" check the disks and/or
  partitions you want to monitor
* Select one of the graph types from the dropdown below:
    * Host MIB - Disk IO - Bytes per second
    * Host MIB - Disk IO - Transactions
* At the bottom of the screen click on "Create"

If you want to monitor both Bytes per second and Transactions per second,
then repeat the process to create graphs of the other type.

## Interpreting the results

It is helpful to have a good idea of how many operations per second you can
expect your drives to be able to handle. There is a good background article
here:
<http://www.symantec.com/connect/articles/getting-hang-iops>
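
For a single spinning disk, a common rule of thumb is that random IOPS is roughly 1 / (average seek time + average rotational latency), where the rotational latency is the time for half a revolution. The sketch below applies that rule; `est_iops` is a made-up name and the seek times are illustrative assumptions, not measurements. It does not apply to SSDs, and RAID or caching will change the picture.

```shell
# Rough random-IOPS estimate for one spinning disk.
# Usage: est_iops RPM AVG_SEEK_MS   (hypothetical helper; inputs are assumptions)
est_iops() {
    awk -v rpm="$1" -v seek="$2" 'BEGIN {
        latency = 0.5 * 60000 / rpm      # half a revolution, in milliseconds
        printf "%.0f\n", 1000 / (seek + latency)
    }'
}

est_iops 7200 8.5     # assumed 8.5ms seek: prints 79
est_iops 15000 3.5    # assumed 3.5ms seek: prints 182
```

If the Transactions graph for a disk sits near the estimate for long periods, that disk is probably the bottleneck.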

