Advanced

Monitoring

Monitoring LLM applications requires tracking dimensions that traditional observability tools were never designed for — output quality, safety compliance, injection attempts, and data leakage in real time.

What to Monitor

Category	Metrics	Alert Threshold
Security	Injection attempt rate, blocked requests, safety filter triggers	Spike above 2x baseline
Quality	Response relevance, coherence scores, user satisfaction	Drop below acceptable threshold
Safety	Toxicity scores, PII detections, policy violations	Any critical violation
Performance	Latency p95/p99, token throughput, error rates	SLA breach thresholds
Usage	Request volume, token consumption, cost per request	Budget threshold exceeded
Drift	Input distribution shift, output distribution change	Statistical significance test

Audit Logging

Every LLM interaction should be logged with sufficient detail for security investigation and compliance:

# Comprehensive LLM audit log entry
audit_entry = {
    "timestamp": "2026-03-15T14:23:01Z",
    "request_id": "req_abc123",
    "user_id": "user_456",
    "session_id": "sess_789",
    "model": "gpt-4-turbo",
    "input": {
        "system_prompt_hash": "sha256:a1b2c3...",
        "user_message_length": 342,
        "detected_patterns": ["none"],
        "input_language": "en"
    },
    "output": {
        "response_length": 856,
        "toxicity_score": 0.02,
        "pii_detected": False,
        "safety_flags": [],
        "tools_invoked": ["search_knowledge_base"]
    },
    "metadata": {
        "latency_ms": 1250,
        "tokens_input": 420,
        "tokens_output": 215,
        "model_version": "2026-02-15"
    }
}

Real-Time Security Alerting

Injection Detection Alerts

Monitor for patterns indicative of prompt injection attempts. Alert when the injection attempt rate exceeds baseline or when novel attack patterns are detected by ML classifiers.
Data Leakage Alerts

Scan all outputs for PII, system prompt fragments, API keys, and other sensitive data patterns. Immediate alert on any detection with automatic redaction.
Anomalous Behavior Alerts

Detect unusual patterns such as a single user making many rapid requests, responses that are unusually long, or tool invocations that deviate from normal patterns.
Safety Violation Alerts

Trigger on any output that scores above toxicity thresholds, contains policy violations, or matches known harmful content patterns. Escalate immediately to the IR team.

Monitoring Architecture

Inline Monitoring

Security checks that run synchronously in the request path. Higher latency impact but can block harmful responses before they reach users.

Async Monitoring

Analysis that runs asynchronously after the response is delivered. No latency impact but cannot prevent harmful outputs — only detect and alert.

Batch Analysis

Periodic analysis of accumulated logs to detect trends, slow-moving attacks, and patterns that are only visible in aggregate.

Red Team Automation

Continuous automated red teaming that probes the production system with known attack patterns to verify defenses remain effective.

💡

Looking Ahead: In the final lesson, we will consolidate everything into a comprehensive best practices guide covering security-first architecture, defense checklists, and building a security culture for AI teams.

← PreviousAgent Security Next →Best Practices

Monitoring

What to Monitor

Audit Logging

Real-Time Security Alerting

Injection Detection Alerts

Data Leakage Alerts

Anomalous Behavior Alerts

Safety Violation Alerts

Monitoring Architecture

Inline Monitoring

Async Monitoring

Batch Analysis

Red Team Automation