NSRC / PACNOG29 Scalable Network Monitoring and Management Tutorial

Dates: 8-9 Dec 2021
Location: Virtual
Host: PITA
Partners: APNIC, ICANN, NSRC
Timezone: UTC+12

# PACNOG29: Scalable Network Monitoring and Management Tutorial #

## Tutorial ##
* [Detailed Agenda](agenda.html)
* [Tutorial Description and Summary](description.html)

## Tutorial Topics ##
* Scalable Network Monitoring and Management: present-day practices and how they differ from classical monitoring
* Examples of tools used, both traditional and modern. Some of the tools we'll discuss and/or demo include:
    * Accessing our platform
    * SmokePing
    * Nagios
    * LibreNMS
    * NetFlow-NG
* Working with the Prometheus Stack and other tools
    * Use of PromQL
    * Using Grafana with Prometheus
    * Using AlertManager with Prometheus
* Philosophy of the scalable approach to monitoring and management:
    * The fundamental distinction between "logs" and "metrics"
    * Concept of labels for timeseries
    * Counters versus gauges
* Scalable NMM tools separate out:
    * data collection
    * data storage and querying (optimised either for timeseries storage or for log indexing/querying)
    * data visualisation
    * alerting
        * Tools such as Elastalert, AlertManager
        * The problem of too many alerts, which is very bad
* ElastiFlow for network flows
* Grafana Loki for logging data

## Useful Links ##
* [Instructors](instructors.html)
* [Reference Materials](references.html)

## Diagrams ##
* [Server and Network Device Access by Group](diagrams/nmm-groups-diagram-oob.png)
* [Class Topology Overview](diagrams/nmm-topology-overview-high-level.png)
* [Campus 1 Detailed](diagrams/nmm-campus1-detailed.png)
* [Campus 1 Detailed with Addresses](diagrams/nmm-campus1-detailed-with-addresses.png)