Platforms (Datadog/Splunk/PagerDuty)
Level: Intermediate

Datadog, Splunk ITSI, and PagerDuty each offer AIOps capabilities that apply to network operations. This lesson walks through each platform's AI features and how to use them for networking use cases.

Datadog for Network AIOps

  • Watchdog — Automatic anomaly detection that needs no configuration; it learns baseline patterns from your telemetry and flags deviations
  • Network Performance Monitoring (NPM) — Flow-level visibility with AI-powered analysis of network paths
  • Network Device Monitoring — SNMP-based monitoring with anomaly detection for network devices
  • Anomaly Monitors — Create monitors that use anomaly detection algorithms (basic, agile, robust) instead of static thresholds
  • Forecast Monitors — Alert when a metric is predicted to cross a threshold in the future
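
To make the last two bullets concrete, here is a minimal sketch that builds monitor definitions using Datadog's anomalies() and forecast() query functions. It only constructs the JSON body you would send to the monitor creation API; it does not call Datadog. The metric name, the scope tag, the window sizes, and the helper names anomaly_monitor and forecast_monitor are assumptions for illustration, so check them against your own account and the current API reference before use.

```python
import json

# Algorithm names accepted by this sketch for anomalies().
ANOMALY_ALGORITHMS = {"basic", "agile", "robust"}


def anomaly_monitor(metric, scope, algorithm="agile", deviations=2):
    """Build a monitor definition that alerts on anomalous values of `metric`."""
    if algorithm not in ANOMALY_ALGORITHMS:
        raise ValueError(f"unknown anomaly algorithm: {algorithm!r}")
    args = [
        f"avg:{metric}{{{scope}}}",
        f"'{algorithm}'",
        str(deviations),
        "direction='above'",
        "alert_window='last_15m'",
        "interval=60",
        "count_default_zero='true'",
    ]
    if algorithm != "basic":
        # Seasonal algorithms need to know which cycle to model (assumed weekly).
        args.append("seasonality='weekly'")
    return {
        "name": f"Anomalous {metric} on {scope}",
        "type": "query alert",
        "query": f"avg(last_4h):anomalies({', '.join(args)}) >= 1",
        "message": "Traffic is outside its learned range.",
        "options": {
            "thresholds": {"critical": 1.0},
            "threshold_windows": {
                "trigger_window": "last_15m",
                "recovery_window": "last_15m",
            },
        },
    }


def forecast_monitor(metric, scope, threshold, horizon="next_1w"):
    """Build a monitor that alerts when `metric` is predicted to reach `threshold`."""
    query = (
        f"max({horizon}):forecast(avg:{metric}{{{scope}}}, 'linear', 1)"
        f" >= {threshold}"
    )
    return {
        "name": f"Forecast: {metric} on {scope}",
        "type": "query alert",
        "query": query,
        "message": "Metric is predicted to cross its threshold.",
        "options": {"thresholds": {"critical": threshold}},
    }


if __name__ == "__main__":
    # Hypothetical SNMP bandwidth metric and device tag.
    body = anomaly_monitor("snmp.ifBandwidthInUsage.rate", "device:core-sw1")
    print(json.dumps(body, indent=2))
```

Keeping the definitions as plain dictionaries makes them easy to review and store in version control before they are submitted.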

Splunk ITSI

  • Service Models — Define services with KPIs that map to network infrastructure health
  • Adaptive Thresholding — ML-based thresholds that adapt to daily and weekly patterns
  • Event Analytics — Correlation of events across multiple data sources into notable events
  • Predictive Analytics — MLTK (Machine Learning Toolkit) for custom network analytics
  • Glass Tables — Visual service health dashboards with real-time KPI status
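
The idea behind adaptive thresholding can be sketched in a few lines. This is not ITSI's implementation. It is a simplified illustration that learns a separate threshold for each hour of the week from historical KPI samples, using mean plus k standard deviations. The function names and the choice of k = 3 are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def hour_of_week(ts):
    """Map a timestamp to one of 168 weekly buckets (Monday 00:00 is bucket 0)."""
    return ts.weekday() * 24 + ts.hour


def learn_thresholds(samples, k=3.0):
    """Learn one upper threshold per hour-of-week bucket.

    samples: iterable of (datetime, value) pairs from historical KPI data.
    Returns {bucket: mean + k * population standard deviation}.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[hour_of_week(ts)].append(value)
    return {b: mean(v) + k * pstdev(v) for b, v in buckets.items()}


def is_breach(thresholds, ts, value):
    """True if `value` exceeds the learned threshold for its time bucket."""
    limit = thresholds.get(hour_of_week(ts))
    return limit is not None and value > limit


if __name__ == "__main__":
    # Hypothetical interface utilization (Mbps) on three consecutive Mondays.
    history = [
        (datetime(2024, 1, 1, 9), 100), (datetime(2024, 1, 1, 3), 10),
        (datetime(2024, 1, 8, 9), 110), (datetime(2024, 1, 8, 3), 12),
        (datetime(2024, 1, 15, 9), 90), (datetime(2024, 1, 15, 3), 8),
    ]
    limits = learn_thresholds(history)
    # The same value is normal at 09:00 but a breach at 03:00.
    print(is_breach(limits, datetime(2024, 1, 22, 9), 50))
    print(is_breach(limits, datetime(2024, 1, 22, 3), 50))
```

A single static threshold would have to be set high enough for the busy hour, which is why the same reading at 03:00 would go unnoticed without per-period thresholds.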

PagerDuty AIOps

  • Intelligent Alert Grouping — ML automatically groups related alerts into a single incident
  • Noise Reduction — Suppress transient alerts and deduplicate across integration sources
  • Past Incidents — Surface similar historical incidents for context during active response
  • Event Intelligence — ML-based triage, priority scoring, and routing recommendations
  • Automation Actions — Trigger diagnostic runbooks and remediation workflows from incidents
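
As a rough intuition for alert grouping, the sketch below folds alerts from the same device into one incident when they arrive within a time window of each other. PagerDuty's Intelligent Alert Grouping is ML-based; this is a plain rule-based stand-in, and the Incident class, the group_alerts function, and the 300-second window are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    device: str
    first_seen: float
    last_seen: float
    alerts: list = field(default_factory=list)


def group_alerts(alerts, window_seconds=300):
    """Group (timestamp, device, summary) alerts into incidents.

    An alert joins the device's most recent incident if it arrives within
    `window_seconds` of that incident's last alert; otherwise it opens a
    new incident.
    """
    latest_by_device = {}
    incidents = []
    for ts, device, summary in sorted(alerts):
        incident = latest_by_device.get(device)
        if incident is None or ts - incident.last_seen > window_seconds:
            incident = Incident(device=device, first_seen=ts, last_seen=ts)
            latest_by_device[device] = incident
            incidents.append(incident)
        incident.last_seen = ts
        incident.alerts.append(summary)
    return incidents


if __name__ == "__main__":
    # Hypothetical alert stream: timestamps in seconds.
    stream = [
        (0, "sw1", "link down"),
        (60, "sw1", "BGP neighbor lost"),
        (90, "rtr2", "high CPU"),
        (1000, "sw1", "link down"),
    ]
    for inc in group_alerts(stream):
        print(inc.device, inc.alerts)
```

Four alerts become three incidents here, because the first two sw1 alerts fall inside the same window.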

Platform Comparison

Capability         | Datadog                             | Splunk ITSI                     | PagerDuty
Best For           | Cloud-native, full-stack monitoring | On-prem, log-heavy environments | Incident management, on-call
Anomaly Detection  | Built-in (Watchdog)                 | MLTK add-on                     | Event Intelligence
Network Visibility | NPM + NDM                           | Network monitoring add-ons      | Via integrations
Automation         | Workflow Automation                 | SOAR integration                | Automation Actions
Pricing Model      | Per host/metric                     | Per GB ingested                 | Per user

Multi-Platform Strategy: Many organizations run several platforms together, for example Datadog for monitoring and PagerDuty for incident management. APIs and webhooks let you connect them into a single AIOps pipeline.
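
One way such glue might look is a small translator that turns a monitoring alert into a PagerDuty Events API v2 event. Datadog also ships a native PagerDuty integration, so treat this as a sketch for custom pipelines only. The field names of the incoming alert dictionary (monitor_id, host, title, status, priority) are hypothetical, since webhook payloads are user-defined, and the routing key is a placeholder.

```python
import hashlib
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Assumed mapping from the alert's priority field to Events API v2 severities.
SEVERITY_BY_PRIORITY = {"P1": "critical", "P2": "error", "P3": "warning"}


def to_pagerduty_event(alert, routing_key):
    """Translate a monitoring alert (hypothetical shape) into an Events API v2 body."""
    # The same monitor and host always yield the same dedup_key, so repeated
    # triggers update one incident and a recovery resolves it.
    dedup_source = f"{alert['monitor_id']}:{alert['host']}"
    action = "resolve" if alert["status"] == "recovered" else "trigger"
    return {
        "routing_key": routing_key,
        "event_action": action,
        "dedup_key": hashlib.sha256(dedup_source.encode()).hexdigest()[:32],
        "payload": {
            "summary": alert["title"][:1024],
            "source": alert["host"],
            "severity": SEVERITY_BY_PRIORITY.get(alert.get("priority"), "info"),
        },
    }


def send_event(event):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    alert = {
        "monitor_id": 12345,
        "host": "core-sw1",
        "title": "Anomalous bandwidth on core-sw1",
        "status": "triggered",
        "priority": "P2",
    }
    # Printed rather than sent; call send_event() with a real routing key.
    print(json.dumps(to_pagerduty_event(alert, "YOUR_ROUTING_KEY"), indent=2))
```

Deriving the dedup_key from stable identifiers, rather than from the alert text, keeps flapping alerts attached to a single incident.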

Next Step

Learn the best practices for building an AIOps culture and measuring success.
