Security Principles for ML

Lesson 3 of 7 in the AI Security Fundamentals course.

Applying Security Principles to Machine Learning

Machine learning systems require the same foundational security principles as traditional software, but applied in ways that account for the unique properties of data-driven systems. In this lesson, we explore how classic security principles map to ML contexts and what additional principles are needed.

Least Privilege for ML Systems

The principle of least privilege states that every component should have only the minimum permissions necessary to perform its function. In ML systems, this applies at multiple levels:

  • Data access: Training pipelines should only access the specific datasets they need, not the entire data lake
  • Model access: Serving infrastructure should have read-only access to model artifacts, never write access
  • API permissions: Model consumers should be scoped to specific models and rate limits, not blanket access
  • Infrastructure: GPU clusters should be isolated from general-purpose compute and have restricted network access

Python
# Example: Implementing least-privilege model serving with role-based access
class ModelAccessControl:
    """Role-based access control for ML model serving."""

    ROLES = {
        "reader": ["predict", "explain"],
        "analyst": ["predict", "explain", "evaluate", "inspect_features"],
        "admin": ["predict", "explain", "evaluate", "inspect_features",
                   "update_model", "view_weights", "export"]
    }

    def __init__(self):
        self.user_roles = {}
        self.audit_log = []

    def assign_role(self, user_id, role):
        if role not in self.ROLES:
            raise ValueError(f"Unknown role: {role}")
        self.user_roles[user_id] = role

    def check_permission(self, user_id, action):
        role = self.user_roles.get(user_id)
        if not role:
            self._log(user_id, action, "DENIED - no role assigned")
            return False
        allowed = action in self.ROLES[role]
        status = "ALLOWED" if allowed else "DENIED"
        self._log(user_id, action, status)
        return allowed

    def _log(self, user_id, action, status):
        self.audit_log.append({
            "user": user_id, "action": action, "status": status
        })

# Usage
acl = ModelAccessControl()
acl.assign_role("api-consumer-1", "reader")
acl.assign_role("data-scientist-1", "analyst")

print(acl.check_permission("api-consumer-1", "predict"))       # True
print(acl.check_permission("api-consumer-1", "view_weights"))   # False
print(acl.check_permission("data-scientist-1", "evaluate"))     # True

Defense in Depth for ML

No single security control is sufficient. ML systems need layered defenses:

  1. Perimeter: API gateway with authentication, rate limiting, and IP allowlisting
  2. Input layer: Schema validation, anomaly detection, and adversarial input filtering
  3. Model layer: Adversarial training, robust architectures, and ensemble methods
  4. Output layer: Output sanitization, confidence thresholds, and human-in-the-loop for critical decisions
  5. Monitoring layer: Drift detection, anomalous query patterns, and performance degradation alerts
💡
Best practice: Implement at least three independent layers of defense for any production ML system. If one layer fails, the others should still provide meaningful protection.
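The layered model above can be sketched as a chain of independent checks, where any single layer failing is enough to reject a request. The layer functions and names below are illustrative stand-ins, not a real gateway API:

```python
# Hypothetical sketch: each defense layer is an independent callable
# returning (ok, reason); a request must pass every layer in order.

def auth_layer(request):
    # Perimeter: require an API key (stand-in for a real gateway check)
    return ("api_key" in request, "missing API key")

def schema_layer(request):
    # Input layer: require a well-formed numeric feature vector
    features = request.get("features")
    ok = isinstance(features, list) and all(
        isinstance(x, (int, float)) for x in features
    )
    return (ok, "malformed features")

def run_defenses(request, layers):
    """Apply each layer in order; reject on the first failure."""
    for layer in layers:
        ok, reason = layer(request)
        if not ok:
            return {"allowed": False, "reason": reason, "layer": layer.__name__}
    return {"allowed": True}

print(run_defenses({"api_key": "k", "features": [1.0, 2.0]},
                   [auth_layer, schema_layer]))
# {'allowed': True}
```

Because each layer is self-contained, a bug or bypass in one check does not silently disable the others, which is the property the three-layer guideline above is trying to guarantee.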

Separation of Duties

In ML systems, separation of duties means that no single person or process should control the entire pipeline from data to deployment:

  • Data engineers manage data pipelines but do not deploy models
  • Data scientists train and evaluate models but do not have production infrastructure access
  • ML engineers handle deployment but cannot modify training data
  • Security teams audit all stages but do not participate in model development
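One way to enforce this programmatically is a two-person rule on deployment: the pipeline refuses to promote a model unless the approver is a different person on a different team than the trainer. The names and team mapping below are purely illustrative:

```python
# Hypothetical two-person rule: deployment requires an approver who is
# neither the trainer nor on the trainer's team.

def can_deploy(trained_by, approved_by, team_of):
    """Return True only when trainer and approver are distinct people
    on distinct teams."""
    if approved_by is None or approved_by == trained_by:
        return False
    return team_of[trained_by] != team_of[approved_by]

teams = {
    "alice": "data-science",
    "bob": "ml-engineering",
    "carol": "data-science",
}

print(can_deploy("alice", "bob", teams))    # True: different teams
print(can_deploy("alice", "carol", teams))  # False: same team
print(can_deploy("alice", "alice", teams))  # False: self-approval
```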

Secure Defaults

ML systems should be secure by default, requiring explicit action to reduce security rather than to add it:

  • Model APIs should require authentication by default
  • Training data should be encrypted at rest by default
  • Model outputs should include confidence scores by default to enable downstream filtering
  • Logging should be enabled by default for all model interactions
  • Input validation should reject unexpected formats by default
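These defaults can be encoded directly in a configuration object, so that every security-relevant setting starts in its safe state and weakening it requires an explicit, visible override. The `ServingConfig` class and its field names below are an assumption for illustration:

```python
from dataclasses import dataclass

# Hypothetical secure-by-default serving configuration: every field
# defaults to the safe value; disabling a protection must be explicit.

@dataclass(frozen=True)
class ServingConfig:
    require_auth: bool = True            # APIs authenticated by default
    encrypt_at_rest: bool = True         # training data encrypted by default
    include_confidence: bool = True      # outputs carry confidence scores
    log_requests: bool = True            # all model interactions logged
    reject_unknown_fields: bool = True   # strict input validation

# The zero-argument constructor is the secure configuration:
default_cfg = ServingConfig()
print(default_cfg.require_auth)  # True

# Weakening security is a deliberate act that shows up in code review:
dev_cfg = ServingConfig(require_auth=False)
```

Making the dataclass `frozen` also prevents a running service from flipping a security setting after startup.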

Fail Securely

When ML systems fail, they should fail in a way that does not compromise security:

  • If input validation fails, reject the input rather than passing it to the model
  • If the model returns low confidence, fall back to a safe default rather than guessing
  • If monitoring detects anomalous behavior, alert and optionally degrade service rather than continuing silently
  • Error messages should not reveal model architecture, version, or internal details

Implementing Secure Failure

Python
class SecureModelServer:
    """Model server with secure failure modes."""

    def __init__(self, model, confidence_threshold=0.7):
        self.model = model
        self.confidence_threshold = confidence_threshold
        self.fallback_response = {"result": "uncertain", "action": "manual_review"}

    def predict(self, input_data):
        try:
            # Validate input
            if not self._validate_input(input_data):
                return {"error": "Invalid input format", "code": 400}

            # Run prediction
            result = self.model.predict(input_data)
            confidence = float(result.get("confidence", 0))

            # Check confidence threshold
            if confidence < self.confidence_threshold:
                self._log_low_confidence(input_data, confidence)
                return self.fallback_response

            return {"result": result["prediction"], "confidence": confidence}

        except Exception:
            # Never expose internal errors
            self._log_error(input_data)
            return {"error": "Service temporarily unavailable", "code": 503}

    def _validate_input(self, data):
        # Minimal structural check; production code should enforce a full schema
        return isinstance(data, dict) and "features" in data

    def _log_low_confidence(self, data, confidence):
        # Log for monitoring without exposing to user
        pass

    def _log_error(self, data):
        # Log full error internally for debugging
        pass

Complete Mediation

Every access to every ML resource must be checked for authorization. This means:

  • Every API call is authenticated and authorized, even internal service-to-service calls
  • Every data access is logged and auditable
  • Model artifacts are verified for integrity before loading
  • Feature stores enforce access controls on individual features

Common mistake: Many organizations secure their model API endpoints but leave internal communication between ML services completely unprotected. An attacker who compromises any service in the pipeline can then access all others.
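The artifact-integrity check above can be implemented with an ordinary cryptographic digest: compute the SHA-256 hash of the model file and compare it to a trusted manifest before loading. In practice the manifest would come from a signed release process; the dict and artifact names here are placeholders:

```python
import hashlib

# Hypothetical sketch: verify a model artifact's SHA-256 digest against a
# trusted manifest before loading. Unknown artifacts are rejected outright.

def verify_artifact(name, data, trusted_digests):
    """Return True only if the artifact's digest matches the manifest."""
    expected = trusted_digests.get(name)
    if expected is None:
        return False  # fail securely: unknown artifacts are never trusted
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected

# Build a toy manifest from a known-good artifact:
artifact = b"model weights..."
manifest = {"fraud-model-v3.bin": hashlib.sha256(artifact).hexdigest()}

print(verify_artifact("fraud-model-v3.bin", artifact, manifest))      # True
print(verify_artifact("fraud-model-v3.bin", b"tampered", manifest))   # False
print(verify_artifact("unknown-model.bin", artifact, manifest))      # False
```

Note that this check mediates every load, not just the first one, which is exactly what complete mediation demands: a cached "verified once" flag would reopen the gap.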

Summary

Security principles for ML systems build on established software security foundations but require careful adaptation. The key principles — least privilege, defense in depth, separation of duties, secure defaults, fail securely, and complete mediation — each take on new dimensions when applied to data-driven systems. In the next lesson, we will perform a detailed attack surface analysis to identify where these principles must be applied.