Platforms (Datadog/Splunk/PagerDuty) Intermediate
Datadog, Splunk ITSI, and PagerDuty each provide AIOps capabilities relevant to network operations. This lesson summarizes the AI features each platform offers for networking use cases and compares where each one fits.
Datadog for Network AIOps
- Watchdog — Automatic anomaly detection across the telemetry Datadog already collects. It needs no configuration: it learns baseline patterns and flags deviations
- Network Performance Monitoring (NPM) — Flow-level visibility with AI-powered analysis of network paths
- Network Device Monitoring — SNMP-based monitoring with anomaly detection for network devices
- Anomaly Monitors — Create monitors that use anomaly detection algorithms (basic, agile, robust) instead of static thresholds
- Forecast Monitors — Alert when a metric is predicted to cross a threshold in the future
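As a rough illustration of the anomaly and forecast monitors listed above, the sketch below builds monitor definitions as plain Python dictionaries. The query strings follow Datadog's monitor query syntax as commonly documented. The metric names (`snmp.ifInErrors`, `snmp.ifBandwidthInUsage.rate`), the tag scope, the window sizes, and the `@pagerduty-network` notification handle are assumptions chosen for the example, so check them against your own account before use. Nothing here calls the Datadog API; the dictionaries are only what you would submit to the monitor endpoint.

```python
import json

# Algorithm names accepted by Datadog's anomalies() function.
ANOMALY_ALGORITHMS = {"basic", "agile", "robust"}


def anomaly_query(metric, scope="*", algorithm="agile", bounds=2,
                  eval_window="last_4h", alert_window="last_15m"):
    """Build an anomaly monitor query string for one metric."""
    if algorithm not in ANOMALY_ALGORITHMS:
        raise ValueError(f"unknown anomaly algorithm: {algorithm}")
    return (
        f"avg({eval_window}):anomalies(avg:{metric}{{{scope}}}, "
        f"'{algorithm}', {bounds}, direction='above', interval=60, "
        f"alert_window='{alert_window}', count_default_zero='true') >= 1"
    )


def forecast_query(metric, threshold, scope="*", horizon="next_1w"):
    """Build a forecast monitor query: alert if the prediction crosses threshold."""
    return f"max({horizon}):forecast(avg:{metric}{{{scope}}}, 'linear', 1) >= {threshold}"


def build_monitor(name, query, critical, message):
    """Wrap a query in a monitor definition (field names assumed from the monitor API)."""
    return {
        "name": name,
        "type": "query alert",
        "query": query,
        "message": message,
        "options": {"thresholds": {"critical": critical}},
    }


if __name__ == "__main__":
    # Hypothetical monitors for interface errors and link utilization.
    errors = build_monitor(
        "Interface input errors are anomalous",
        anomaly_query("snmp.ifInErrors", scope="device_vendor:cisco"),
        critical=1,
        message="Input errors deviate from the learned baseline. @pagerduty-network",
    )
    saturation = build_monitor(
        "Link utilization forecast to exceed 90%",
        forecast_query("snmp.ifBandwidthInUsage.rate", 90),
        critical=90,
        message="Link is predicted to saturate within a week.",
    )
    print(json.dumps([errors, saturation], indent=2))
```

Keeping the query construction in small functions makes it easy to validate the algorithm name and reuse the same pattern across many interfaces or devices.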
Splunk ITSI
- Service Models — Define services with KPIs that map to network infrastructure health
- Adaptive Thresholding — ML-based thresholds that adapt to daily and weekly patterns
- Event Analytics — Correlation of events across multiple data sources into notable events
- Predictive Analytics — MLTK (Machine Learning Toolkit) for custom network analytics
- Glass Tables — Visual service health dashboards with real-time KPI status
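To make the idea behind adaptive thresholding concrete, here is a minimal conceptual sketch. It is not Splunk ITSI's implementation. It assumes a simple policy: bucket historical KPI samples by weekday and hour, then set each bucket's upper threshold to the mean plus `k` standard deviations. The function names and the choice of `k` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def adaptive_thresholds(samples, k=3.0):
    """Return {(weekday, hour): upper_threshold} from (timestamp, value) samples."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Each hour of the week gets its own baseline.
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {slot: mean(vals) + k * pstdev(vals) for slot, vals in buckets.items()}


def is_anomalous(ts, value, thresholds):
    """True when value exceeds the threshold learned for that weekday and hour."""
    limit = thresholds.get((ts.weekday(), ts.hour))
    return limit is not None and value > limit


if __name__ == "__main__":
    # Synthetic history: four weeks of hourly link utilization (percent),
    # high during business hours and low overnight.
    start = datetime(2024, 1, 1)
    history = []
    for h in range(24 * 28):
        ts = start + timedelta(hours=h)
        base = 80.0 if 9 <= ts.hour < 17 else 10.0
        history.append((ts, base + h // 168))  # slight week-over-week growth

    limits = adaptive_thresholds(history)
    # 60% utilization is normal at 10:00 but unusual at 02:00.
    print(is_anomalous(datetime(2024, 1, 29, 10), 60.0, limits))
    print(is_anomalous(datetime(2024, 1, 29, 2), 60.0, limits))
```

The point of the per-slot baseline is that the same value can be normal at one time of the week and anomalous at another, which a single static threshold cannot express.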
PagerDuty AIOps
- Intelligent Alert Grouping — ML automatically groups related alerts into a single incident
- Noise Reduction — Suppress transient alerts and deduplicate across integration sources
- Past Incidents — Surface similar historical incidents for context during active response
- Event Intelligence — ML-based triage, priority scoring, and routing recommendations
- Automation Actions — Trigger diagnostic runbooks and remediation workflows from incidents
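The noise-reduction idea above (suppress transient alerts, deduplicate across sources) can be sketched in a few lines. This is a conceptual illustration only, not how PagerDuty implements the feature. The `Alert` class, the `min_duration` cutoff, and the rule "keep the earliest alert per deduplication key" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    dedup_key: str                       # same key means same underlying problem
    source: str                          # integration that sent the alert
    triggered_at: float                  # seconds since some epoch
    resolved_at: Optional[float] = None  # None while the alert is still open


def reduce_noise(alerts, min_duration=120.0):
    """Drop alerts that cleared within min_duration, then keep one per dedup_key."""
    kept = {}
    for alert in sorted(alerts, key=lambda a: a.triggered_at):
        transient = (
            alert.resolved_at is not None
            and alert.resolved_at - alert.triggered_at < min_duration
        )
        if transient or alert.dedup_key in kept:
            continue
        kept[alert.dedup_key] = alert  # earliest surviving alert wins
    return list(kept.values())


if __name__ == "__main__":
    raw = [
        Alert("core-sw1:Gi0/1:down", "datadog", 0.0),
        Alert("core-sw1:Gi0/1:down", "splunk", 5.0),           # duplicate
        Alert("edge-rtr2:bgp-flap", "datadog", 10.0, 40.0),    # cleared in 30 s
        Alert("edge-rtr3:high-latency", "splunk", 20.0, 900.0),
    ]
    for alert in reduce_noise(raw):
        print(alert.dedup_key, alert.source)
```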
Platform Comparison
| Capability | Datadog | Splunk ITSI | PagerDuty |
|---|---|---|---|
| Best For | Cloud-native, full-stack monitoring | On-prem, log-heavy environments | Incident management, on-call |
| Anomaly Detection | Built-in (Watchdog) | MLTK add-on | Event Intelligence |
| Network Visibility | NPM + NDM | Network monitoring add-ons | Via integrations |
| Automation | Workflow Automation | SOAR integration | Automation Actions |
| Pricing Model | Per host/metric | Per GB ingested | Per user |
Multi-Platform Strategy: Many organizations combine platforms, for example Datadog for monitoring and PagerDuty for incident management. APIs and webhooks connect them into a single AIOps pipeline.
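One way to wire such a pipeline together is a small relay that receives a monitoring webhook and forwards it to the PagerDuty Events API v2. The sketch below is a minimal version under stated assumptions: the incoming alert fields (`alert_id`, `title`, `host`, `severity`, `recovered`, `details`) are hypothetical and depend on how you template the webhook, and the routing key is a placeholder. The endpoint URL and the outgoing field names follow the Events API v2 format.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def to_pagerduty_event(alert, routing_key):
    """Translate a (hypothetical) monitoring webhook body into an Events API v2 event."""
    severity = alert.get("severity", "warning")
    if severity not in ALLOWED_SEVERITIES:
        severity = "warning"  # fall back rather than send a value the API rejects
    return {
        "routing_key": routing_key,
        "event_action": "resolve" if alert.get("recovered") else "trigger",
        # Reusing the monitor's alert id lets trigger and resolve events pair up.
        "dedup_key": str(alert["alert_id"]),
        "payload": {
            "summary": alert["title"][:1024],  # keep the summary within a safe length
            "source": alert["host"],
            "severity": severity,
            "custom_details": alert.get("details", {}),
        },
    }


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    webhook_body = {
        "alert_id": 12345,
        "title": "Interface input errors are anomalous on core-sw1",
        "host": "core-sw1",
        "severity": "critical",
        "details": {"interface": "Gi0/1"},
    }
    event = to_pagerduty_event(webhook_body, routing_key="YOUR_ROUTING_KEY")
    print(json.dumps(event, indent=2))  # call send_event(event) with a real key
```

Separating translation from delivery keeps the mapping testable without network access, and makes it straightforward to add a second destination later.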