# PACNOG29: Scalable Network Monitoring and Management Tutorial #
  • Dates: 8-9 Dec 2021
  • Location: Virtual
  • Host: PITA
  • Partners: APNIC, ICANN, NSRC
  • Timezone: UTC+12
## Tutorial ##
  • [Detailed Agenda](agenda.html)
  • [Tutorial Description and Summary](description.html)
 
## Tutorial Topics ##

* Scalable Network Monitoring and Management. Present day practices and how they differ from classical monitoring
* Examples of tools used, traditional and modern, some of the tools we'll discuss and/or demo include:
    * Accessing our platform
    * SmokePing
    * Nagios
    * LibreNMS
    * NetFlow-NG
* Working with the Prometheus Stack and other Tools
    * Use of PromQL
    * Using Grafana with Prometheus
    * Using AlertManager with Prometheus
* Philosophy of Scalable Monitoring and Management version:
    * The fundamental distinction between "logs" and "metrics"
    * Concept of labels for timeseries
    * Counters versus gauges
* Scalable NMM Tools separate out:
    * data collection
    * data storage and querying (optimised either for timeseries storage or for log indexing/querying)
    * data visualisation
    * alerting
* Tools such as Elastalert, AlertManager
    * Issue of too many alerts, which is very bad
* ElastiFlow for network flows
* Grafana Loki for logging data

## Useful Links ##

* [Instructors](instructors.html)
* [Reference Materials](references.html)

## Diagrams ##

* [Server and Network Device Access by Group](diagrams/nmm-groups-diagram-oob.png)
* [Class Topology Overview](diagrams/nmm-topology-overview-high-level.png)
* [Campus 1 Detailed](diagrams/nmm-campus1-detailed.png)
* [Campus 1 Detailed with Addresses](diagrams/nmm-campus1-detailed-with-addresses.png)