Security Principles for ML
Lesson 3 of 7 in the AI Security Fundamentals course.
Applying Security Principles to Machine Learning
Machine learning systems require the same foundational security principles as traditional software, but applied in ways that account for the unique properties of data-driven systems. In this lesson, we explore how classic security principles map to ML contexts and what additional principles are needed.
Least Privilege for ML Systems
The principle of least privilege states that every component should have only the minimum permissions necessary to perform its function. In ML systems, this applies at multiple levels:
- Data access: Training pipelines should only access the specific datasets they need, not the entire data lake
- Model access: Serving infrastructure should have read-only access to model artifacts, never write access
- API permissions: Model consumers should be scoped to specific models and rate limits, not blanket access
- Infrastructure: GPU clusters should be isolated from general-purpose compute and have restricted network access
```python
# Example: implementing least-privilege model serving with role-based access

class ModelAccessControl:
    """Role-based access control for ML model serving."""

    ROLES = {
        "reader": ["predict", "explain"],
        "analyst": ["predict", "explain", "evaluate", "inspect_features"],
        "admin": ["predict", "explain", "evaluate", "inspect_features",
                  "update_model", "view_weights", "export"],
    }

    def __init__(self):
        self.user_roles = {}
        self.audit_log = []

    def assign_role(self, user_id, role):
        if role not in self.ROLES:
            raise ValueError(f"Unknown role: {role}")
        self.user_roles[user_id] = role

    def check_permission(self, user_id, action):
        role = self.user_roles.get(user_id)
        if not role:
            self._log(user_id, action, "DENIED - no role assigned")
            return False
        allowed = action in self.ROLES[role]
        self._log(user_id, action, "ALLOWED" if allowed else "DENIED")
        return allowed

    def _log(self, user_id, action, status):
        self.audit_log.append(
            {"user": user_id, "action": action, "status": status}
        )

# Usage
acl = ModelAccessControl()
acl.assign_role("api-consumer-1", "reader")
acl.assign_role("data-scientist-1", "analyst")
print(acl.check_permission("api-consumer-1", "predict"))       # True
print(acl.check_permission("api-consumer-1", "view_weights"))  # False
print(acl.check_permission("data-scientist-1", "evaluate"))    # True
```
Defense in Depth for ML
No single security control is sufficient. ML systems need layered defenses:
- Perimeter: API gateway with authentication, rate limiting, and IP allowlisting
- Input layer: Schema validation, anomaly detection, and adversarial input filtering
- Model layer: Adversarial training, robust architectures, and ensemble methods
- Output layer: Output sanitization, confidence thresholds, and human-in-the-loop for critical decisions
- Monitoring layer: Drift detection, anomalous query patterns, and performance degradation alerts
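The layering above can be sketched as a request pipeline in which each defense can independently reject a request before it reaches the next layer. This is a minimal illustrative sketch, not a specific product's API; the layer functions, the hard-coded API key, and the 0.7 confidence threshold are all assumptions for demonstration.

```python
# Hypothetical sketch: each defense layer can independently stop a request.
# Layer names, the API key check, and the threshold are illustrative only.

def perimeter_layer(request):
    # Perimeter: authentication (rate limiting and allowlisting would sit here too)
    return request.get("api_key") == "expected-key"

def input_layer(request):
    # Input layer: schema validation (features must be a list of numbers)
    features = request.get("features")
    return isinstance(features, list) and all(
        isinstance(x, (int, float)) for x in features
    )

def output_layer(prediction):
    # Output layer: suppress low-confidence results
    return prediction if prediction["confidence"] >= 0.7 else None

def handle_request(request, model_fn):
    """Run a request through layered defenses; any layer can reject it."""
    for layer in (perimeter_layer, input_layer):
        if not layer(request):
            return {"error": "rejected"}
    result = output_layer(model_fn(request["features"]))
    return result if result is not None else {"result": "uncertain"}

# Usage with a stub model standing in for real inference
stub_model = lambda feats: {"prediction": "ok", "confidence": 0.9}
print(handle_request({"api_key": "expected-key", "features": [1.0, 2.0]}, stub_model))
```

The point of the structure is that removing any single layer still leaves the others in place, which is exactly the property "no single control is sufficient" demands.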
Separation of Duties
In ML systems, separation of duties means that no single person or process should control the entire pipeline from data to deployment:
- Data engineers manage data pipelines but do not deploy models
- Data scientists train and evaluate models but do not have production infrastructure access
- ML engineers handle deployment but cannot modify training data
- Security teams audit all stages but do not participate in model development
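A simple automated check can enforce this division before a pipeline runs: verify that no single identity owns every stage. The sketch below is illustrative; the stage names and `check_separation` helper are assumptions, not part of any standard tooling.

```python
# Hypothetical sketch: refuse to run a pipeline in which one identity
# owns every stage. Stage names are illustrative assumptions.

PIPELINE_STAGES = ("data", "training", "deployment")

def check_separation(stage_owners):
    """Return True only if at least two distinct identities own stages."""
    owners = {stage_owners.get(stage) for stage in PIPELINE_STAGES}
    if None in owners:
        raise ValueError("every pipeline stage needs an assigned owner")
    return len(owners) > 1

print(check_separation({"data": "alice", "training": "bob", "deployment": "carol"}))  # True
print(check_separation({"data": "alice", "training": "alice", "deployment": "alice"}))  # False
```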
Secure Defaults
ML systems should be secure by default, requiring explicit action to reduce security rather than to add it:
- Model APIs should require authentication by default
- Training data should be encrypted at rest by default
- Model outputs should include confidence scores by default to enable downstream filtering
- Logging should be enabled by default for all model interactions
- Input validation should reject unexpected formats by default
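One way to encode these defaults is a configuration object whose default values are the secure choice, so that weakening security is always an explicit, visible act in code review. The field names below are illustrative assumptions mirroring the list above.

```python
# Hypothetical sketch: a serving configuration whose defaults are the
# secure choice; weakening any of them requires an explicit argument.
from dataclasses import dataclass

@dataclass(frozen=True)
class ServingConfig:
    require_auth: bool = True             # authentication on by default
    encrypt_at_rest: bool = True          # training data encrypted by default
    include_confidence: bool = True       # outputs carry confidence scores
    log_interactions: bool = True         # all model calls are logged
    strict_input_validation: bool = True  # unexpected formats rejected

default = ServingConfig()
print(default.require_auth)  # True: secure unless explicitly overridden

# Reducing security is a deliberate, greppable act:
dev_only = ServingConfig(require_auth=False)
```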
Fail Securely
When ML systems fail, they should fail in a way that does not compromise security:
- If input validation fails, reject the input rather than passing it to the model
- If the model returns low confidence, fall back to a safe default rather than guessing
- If monitoring detects anomalous behavior, alert and optionally degrade service rather than continuing silently
- Error messages should not reveal model architecture, version, or internal details
Implementing Secure Failure
```python
class SecureModelServer:
    """Model server with secure failure modes."""

    def __init__(self, model, confidence_threshold=0.7):
        self.model = model
        self.confidence_threshold = confidence_threshold
        self.fallback_response = {"result": "uncertain", "action": "manual_review"}

    def predict(self, input_data):
        try:
            # Validate input before it reaches the model
            if not self._validate_input(input_data):
                return {"error": "Invalid input format", "code": 400}
            # Run prediction
            result = self.model.predict(input_data)
            confidence = float(result.get("confidence", 0))
            # Below the confidence threshold, fall back to a safe default
            if confidence < self.confidence_threshold:
                self._log_low_confidence(input_data, confidence)
                return self.fallback_response
            return {"result": result["prediction"], "confidence": confidence}
        except Exception:
            # Never expose internal errors to the caller
            self._log_error(input_data)
            return {"error": "Service temporarily unavailable", "code": 503}

    def _validate_input(self, data):
        # Strict schema validation
        return isinstance(data, dict) and "features" in data

    def _log_low_confidence(self, data, confidence):
        # Log for monitoring without exposing details to the user
        pass

    def _log_error(self, data):
        # Log the full error internally for debugging
        pass
```
Complete Mediation
Every access to every ML resource must be checked for authorization. This means:
- Every API call is authenticated and authorized, even internal service-to-service calls
- Every data access is logged and auditable
- Model artifacts are verified for integrity before loading
- Feature stores enforce access controls on individual features
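The artifact-integrity requirement can be made concrete with a digest check before loading. This is a minimal sketch assuming the expected SHA-256 digest is obtained out of band (for example, from a signed release manifest); the `verify_artifact` helper is illustrative, not a library API.

```python
# Minimal sketch of artifact integrity checking before model loading.
# Assumes the expected digest is distributed out of band (e.g. a signed manifest).
import hashlib

def verify_artifact(artifact_bytes, expected_sha256):
    """Refuse to load a model artifact whose digest does not match."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    if digest != expected_sha256:
        raise ValueError("model artifact failed integrity check")
    return True

# Usage
artifact = b"model-weights-v1"
expected = hashlib.sha256(artifact).hexdigest()
print(verify_artifact(artifact, expected))  # True
```

A production system would typically pair the digest with a signature so that the manifest itself cannot be silently replaced.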
Summary
Security principles for ML systems build on established software security foundations but require careful adaptation. The key principles — least privilege, defense in depth, separation of duties, secure defaults, fail securely, and complete mediation — each take on new dimensions when applied to data-driven systems. In the next lesson, we will perform a detailed attack surface analysis to identify where these principles must be applied.
Lilly Tech Systems