Advanced

Monitoring

Monitoring LLM applications requires tracking dimensions that traditional observability tools were never designed for — output quality, safety compliance, injection attempts, and data leakage in real time.

What to Monitor

Category Metrics Alert Threshold
Security Injection attempt rate, blocked requests, safety filter triggers Spike above 2x baseline
Quality Response relevance, coherence scores, user satisfaction Drop below acceptable threshold
Safety Toxicity scores, PII detections, policy violations Any critical violation
Performance Latency p95/p99, token throughput, error rates SLA breach thresholds
Usage Request volume, token consumption, cost per request Budget threshold exceeded
Drift Input distribution shift, output distribution change Statistical significance test

Audit Logging

Every LLM interaction should be logged with sufficient detail for security investigation and compliance:

# Comprehensive LLM audit log entry
audit_entry = {
    "timestamp": "2026-03-15T14:23:01Z",
    "request_id": "req_abc123",
    "user_id": "user_456",
    "session_id": "sess_789",
    "model": "gpt-4-turbo",
    "input": {
        "system_prompt_hash": "sha256:a1b2c3...",
        "user_message_length": 342,
        "detected_patterns": ["none"],
        "input_language": "en"
    },
    "output": {
        "response_length": 856,
        "toxicity_score": 0.02,
        "pii_detected": False,
        "safety_flags": [],
        "tools_invoked": ["search_knowledge_base"]
    },
    "metadata": {
        "latency_ms": 1250,
        "tokens_input": 420,
        "tokens_output": 215,
        "model_version": "2026-02-15"
    }
}

Real-Time Security Alerting

  1. Injection Detection Alerts

    Monitor for patterns indicative of prompt injection attempts. Alert when the injection attempt rate exceeds baseline or when novel attack patterns are detected by ML classifiers.

  2. Data Leakage Alerts

    Scan all outputs for PII, system prompt fragments, API keys, and other sensitive data patterns. Immediate alert on any detection with automatic redaction.

  3. Anomalous Behavior Alerts

    Detect unusual patterns such as a single user making many rapid requests, responses that are unusually long, or tool invocations that deviate from normal patterns.

  4. Safety Violation Alerts

    Trigger on any output that scores above toxicity thresholds, contains policy violations, or matches known harmful content patterns. Escalate immediately to the IR team.

Monitoring Architecture

Inline Monitoring

Security checks that run synchronously in the request path. Higher latency impact but can block harmful responses before they reach users.

Async Monitoring

Analysis that runs asynchronously after the response is delivered. No latency impact but cannot prevent harmful outputs — only detect and alert.

Batch Analysis

Periodic analysis of accumulated logs to detect trends, slow-moving attacks, and patterns that are only visible in aggregate.

Red Team Automation

Continuous automated red teaming that probes the production system with known attack patterns to verify defenses remain effective.

💡
Looking Ahead: In the final lesson, we will consolidate everything into a comprehensive best practices guide covering security-first architecture, defense checklists, and building a security culture for AI teams.