Beginner

Introduction to Cloud AI Security

Cloud platforms are the primary infrastructure for AI workloads. Understanding the shared responsibility model, unique attack surfaces, and security controls for cloud AI services is essential.

The Shared Responsibility Model for AI

Every major cloud provider operates under a shared responsibility model. For AI workloads, responsibilities shift depending on the service tier:

  • IaaS (GPU VMs): provider secures physical infrastructure, hypervisor, and network; customer secures OS, runtime, ML frameworks, data, models, and access control
  • PaaS (SageMaker, Vertex AI): provider secures infrastructure, OS, runtime, and the ML platform; customer secures data, models, access control, and endpoint configuration
  • SaaS (Bedrock, AI APIs): provider secures everything except customer data and access; customer secures data sent to APIs, access control, and usage policies
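As a rough illustration (not an official provider matrix), the tier-to-responsibility mapping above can be encoded as a simple lookup, for example to generate a customer-side security checklist. The tier names and item labels below mirror the list above and are this sketch's own:

```python
# Hypothetical encoding of the shared responsibility tiers above.
# Consult provider documentation for the authoritative boundaries.
CUSTOMER_RESPONSIBILITIES = {
    "iaas": ["os", "runtime", "ml_frameworks", "data", "models", "access_control"],
    "paas": ["data", "models", "access_control", "endpoint_configuration"],
    "saas": ["data_sent_to_apis", "access_control", "usage_policies"],
}

def customer_checklist(tier: str) -> list[str]:
    """Return the customer-owned security items for a service tier."""
    try:
        return CUSTOMER_RESPONSIBILITIES[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown service tier: {tier}")
```

Note how the customer's list shrinks as you move up the stack, while data and access control stay on the customer's side at every tier.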

Unique Risks of Cloud AI

  • Data exposure through AI APIs: Training data, prompts, and model outputs may be logged, cached, or used for service improvement unless explicitly opted out
  • Model endpoint exposure: Misconfigured inference endpoints can be accessed by unauthorized users, enabling model theft or abuse
  • Cross-tenant risks: Shared GPU infrastructure on multi-tenant AI platforms may expose tenants to side-channel attacks from co-located workloads
  • Cost-based attacks: Adversaries can trigger expensive GPU training or inference jobs through compromised credentials, leading to massive bills
  • Data residency violations: Without explicit region configuration, AI services may process data in regions that violate regulatory requirements
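To illustrate the cost-based attack above: a pipeline can pre-flight estimate a training job's GPU spend and refuse jobs over a budget cap, limiting the blast radius of a compromised credential. The price table and budget below are illustrative values, not real quotes:

```python
# Illustrative on-demand GPU prices in USD/hour -- NOT real quotes.
GPU_HOURLY_USD = {"p4d.24xlarge": 32.77, "g5.xlarge": 1.01}

def estimated_cost(instance_type: str, instance_count: int, hours: float) -> float:
    """Project the spend for a training job before submitting it."""
    return GPU_HOURLY_USD[instance_type] * instance_count * hours

def approve_job(instance_type: str, instance_count: int,
                hours: float, budget_usd: float = 500.0) -> bool:
    """Reject jobs whose projected spend exceeds the budget cap."""
    return estimated_cost(instance_type, instance_count, hours) <= budget_usd
```

A guard like this catches both honest mistakes and an attacker spinning up large multi-node jobs with stolen credentials; cloud-native budget alerts provide a second, after-the-fact layer.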

Core Security Pillars

Identity & Access Management

Apply least-privilege IAM policies to ML services, use dedicated service accounts for pipelines, issue temporary credentials to training jobs, and enforce cross-service permission boundaries.
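As a least-privilege sketch, a policy for an inference client might grant only `sagemaker:InvokeEndpoint` on a single endpoint ARN. The account ID and endpoint name below are placeholders:

```python
import json

# Minimal IAM policy: the caller may invoke exactly one SageMaker
# endpoint and nothing else. The ARN values are placeholders.
INVOKE_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:123456789012:endpoint/my-model",
        }
    ],
}

# Serialized form, as you would attach it to a role or user.
policy_json = json.dumps(INVOKE_ONLY_POLICY, indent=2)
```

Scoping `Resource` to one endpoint ARN (rather than `*`) means a leaked credential cannot enumerate or invoke other models in the account.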

Network Security

Use VPC endpoints for AI services, private link connections, network segmentation between training and inference environments, and firewall rules for ML API endpoints.
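One way to keep inference traffic off the public internet is an interface VPC endpoint for the SageMaker runtime, with an endpoint policy restricting which principals may use it. A sketch of the request parameters (the role ARN, VPC, subnet, and security-group IDs are placeholders; the call itself is not executed here):

```python
import json

# Endpoint policy: only one role may call InvokeEndpoint through this
# VPC endpoint. The role ARN is a placeholder.
endpoint_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/inference-client"},
        "Action": "sagemaker:InvokeEndpoint",
        "Resource": "*",
    }],
}

# Keyword arguments you would pass to
# boto3.client("ec2").create_vpc_endpoint(**endpoint_kwargs).
endpoint_kwargs = {
    "VpcEndpointType": "Interface",
    "ServiceName": "com.amazonaws.us-east-1.sagemaker.runtime",
    "VpcId": "vpc-0123456789abcdef0",
    "SubnetIds": ["subnet-0123456789abcdef0"],
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
    "PrivateDnsEnabled": True,
    "PolicyDocument": json.dumps(endpoint_policy),
}
```

With `PrivateDnsEnabled`, existing SDK calls to the regional SageMaker runtime hostname resolve to the private endpoint, so clients need no code changes.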

Data Protection

Encrypt training data and model artifacts at rest, encrypt all AI API traffic in transit, and use customer-managed keys where you need control over key material.
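In SageMaker, for instance, customer-managed keys and in-transit encryption are opt-in fields on the training job request. A fragment showing the relevant fields (the key ARNs and S3 path are placeholders):

```python
# Encryption-related fields from a SageMaker CreateTrainingJob request.
# Key ARNs and the S3 path are placeholders.
training_job_encryption = {
    "OutputDataConfig": {
        "S3OutputPath": "s3://my-ml-artifacts/models/",
        # Customer-managed key for model artifacts written to S3.
        "KmsKeyId": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
    },
    "ResourceConfig": {
        "InstanceType": "ml.g5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
        # Customer-managed key for the training instances' EBS volumes.
        "VolumeKmsKeyId": "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
    },
    # Encrypt traffic between training containers (e.g. distributed jobs).
    "EnableInterContainerTrafficEncryption": True,
}
```

None of these fields is set by default, which is exactly the kind of insecure-by-default gap the next section's common mistake describes.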

Monitoring & Audit

Log all ML operations comprehensively, alert on suspicious activity in real time, and retain audit trails for compliance and forensic investigation.
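A monitoring pipeline might scan CloudTrail-style events for high-risk ML API calls. A simplified detector (the event shape follows CloudTrail's `eventSource`/`eventName` fields; the watchlist is illustrative, not a vetted detection rule set):

```python
# Illustrative watchlist of ML API calls worth alerting on.
SUSPICIOUS_EVENTS = {
    ("sagemaker.amazonaws.com", "CreateTrainingJob"),
    ("sagemaker.amazonaws.com", "DeleteEndpoint"),
    ("bedrock.amazonaws.com", "InvokeModel"),
}

def flag_events(events: list[dict]) -> list[dict]:
    """Return events whose (source, name) pair is on the watchlist."""
    return [
        e for e in events
        if (e.get("eventSource"), e.get("eventName")) in SUSPICIOUS_EVENTS
    ]
```

In practice you would enrich this with identity and source-IP context before alerting, so that routine pipeline activity does not drown out a genuinely anomalous caller.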

Common Mistake: Many teams deploy AI services with default configurations optimized for ease of use, not security. Default SageMaker endpoints are reachable over the public AWS API rather than through a VPC endpoint. Default Vertex AI notebooks allow root access. Always review and harden service configurations before deployment.
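That review step can be partially automated. A toy linter over a service configuration dict illustrates the idea (the field names here are invented for illustration, not a real provider schema):

```python
def audit_config(config: dict) -> list[str]:
    """Return findings for insecure-by-default settings (toy schema)."""
    findings = []
    # Defaults below model "permissive unless explicitly hardened".
    if config.get("public_access", True):
        findings.append("endpoint allows public access")
    if config.get("root_access", True):
        findings.append("notebook allows root access")
    if not config.get("kms_key_id"):
        findings.append("no customer-managed encryption key")
    return findings
```

Note that an empty config produces findings: the checker treats anything not explicitly hardened as a problem, mirroring how insecure defaults behave in real services.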
💡 Next Up: In the next lesson, we dive deep into AWS AI Security: SageMaker hardening, IAM for Bedrock, VPC endpoints, and CloudTrail configuration for ML operations.