Intermediate

AWS AI Security

AWS offers the broadest set of AI/ML services. Securing SageMaker, Bedrock, and other AWS AI services requires understanding IAM policies, VPC configuration, encryption, and audit logging.

SageMaker Security Hardening

  1. VPC-Only Mode

    Deploy SageMaker notebooks, training jobs, and endpoints inside a VPC. Disable direct internet access and route all traffic through VPC endpoints or NAT gateways. This sharply limits the paths available for data exfiltration and unauthorized API access.
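In boto3, the VPC pinning happens via the `VpcConfig` block on the training-job request. A minimal sketch of such a request builder, assuming hypothetical subnet/security-group IDs, bucket names, and ARNs that you would replace with your own:

```python
import json

# Hypothetical private subnets and security group -- replace with your own.
TRAINING_SUBNETS = ["subnet-0aaa1111", "subnet-0bbb2222"]
TRAINING_SG = ["sg-0ccc3333"]

def vpc_locked_training_job(job_name: str, role_arn: str) -> dict:
    """Build the request for sagemaker.create_training_job with the job
    confined to a VPC (no direct internet route)."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {
            "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:latest",
            "TrainingInputMode": "File",
        },
        "OutputDataConfig": {"S3OutputPath": "s3://my-ml-artifacts/output/"},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # Pin the job to private subnets: traffic reaches S3/ECR only
        # through VPC endpoints or a NAT gateway, never a public route.
        "VpcConfig": {
            "Subnets": TRAINING_SUBNETS,
            "SecurityGroupIds": TRAINING_SG,
        },
    }

request = vpc_locked_training_job(
    "fraud-model-v1", "arn:aws:iam::123456789012:role/TrainingRole")
print(json.dumps(request["VpcConfig"], indent=2))
```

You would pass this dict to `sagemaker_client.create_training_job(**request)`; the same `VpcConfig` shape applies to endpoint models and processing jobs.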

  2. Execution Role Scoping

    Create dedicated IAM execution roles for each SageMaker component. Training roles need S3 access to training data. Endpoint roles need only model artifact access. Never reuse a single broad role across all components.
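A least-privilege training-role policy can be sketched as below. The bucket names are hypothetical; the key point is that the training role gets read on the data bucket and write on the artifact bucket, and nothing else:

```python
import json

def training_role_policy(data_bucket: str, artifact_bucket: str) -> dict:
    """Least-privilege IAM policy document for a SageMaker training
    execution role: read training data, write model artifacts."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadTrainingData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                ],
            },
            {
                "Sid": "WriteModelArtifacts",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{artifact_bucket}/*"],
            },
        ],
    }

policy = training_role_policy("ml-training-data", "ml-model-artifacts")
print(json.dumps(policy, indent=2))
```

An endpoint role would get its own, separate policy scoped to `GetObject` on the artifact bucket only, following the same pattern.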

  3. Network Isolation

    Enable network isolation for training jobs and processing jobs. This prevents containers from making outbound network calls, closing off the training container as a data exfiltration channel; the SageMaker platform still stages S3 input and output on the job's behalf.
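Network isolation is a single boolean on the training-job request. A small sketch of merging it into an otherwise-complete `create_training_job` request (the job name here is illustrative):

```python
# With isolation enabled, the training container has no outbound network
# access at all; the SageMaker platform stages S3 input/output for it.
isolation_settings = {"EnableNetworkIsolation": True}

def with_network_isolation(request: dict) -> dict:
    """Return a copy of a training-job request with isolation enabled."""
    return {**request, **isolation_settings}

job = with_network_isolation({"TrainingJobName": "fraud-model-v1"})
```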

  4. Notebook Security

    Disable root access on SageMaker notebooks. Use lifecycle configurations to enforce security policies. Enable encryption for notebook storage volumes using KMS customer-managed keys.
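The notebook hardening settings above map directly onto parameters of `sagemaker.create_notebook_instance`. A sketch of a hardened request, with hypothetical subnet/key IDs and a hypothetical lifecycle configuration name:

```python
def hardened_notebook_request(name: str, role_arn: str, kms_key_id: str) -> dict:
    """Request body for sagemaker.create_notebook_instance with the
    hardening settings from this section applied."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": "ml.t3.medium",
        "RoleArn": role_arn,
        "RootAccess": "Disabled",            # no root inside the notebook
        "DirectInternetAccess": "Disabled",  # AWS APIs via VPC endpoints only
        "KmsKeyId": kms_key_id,              # CMK-encrypted storage volume
        # Hypothetical lifecycle config enforcing your security baseline
        # (package pinning, idle shutdown, etc.) on start.
        "LifecycleConfigName": "enforce-security-baseline",
        "SubnetId": "subnet-0aaa1111",        # example IDs -- replace
        "SecurityGroupIds": ["sg-0ccc3333"],
    }

req = hardened_notebook_request(
    "research-nb",
    "arn:aws:iam::123456789012:role/NotebookRole",
    "arn:aws:kms:us-east-1:123456789012:key/example-key-id")
```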

IAM for AWS AI Services

For each service, restrict the high-risk IAM actions and apply a recommended boundary:

  • SageMaker: restrict CreateEndpoint, CreateTrainingJob, and CreateNotebookInstance; apply a permission boundary per team/project
  • Bedrock: restrict InvokeModel and CreateModelCustomizationJob; use model-specific policies and explicitly deny access to restricted models
  • Comprehend: restrict DetectPiiEntities and StartEntitiesDetectionJob; gate access by data classification
  • Rekognition: restrict DetectFaces and SearchFacesByImage; limit facial recognition to approved use cases
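As one worked example from the table, a Bedrock policy that allows invoking an approved model family while explicitly denying customization jobs might be sketched like this (the model ARN is illustrative; adjust region and model IDs to your allow list):

```python
# IAM policy document: allow an approved Bedrock model, deny customization.
# An explicit Deny wins over any Allow granted elsewhere in the account.
bedrock_guardrail_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowApprovedModels",
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": [
                # Example foundation-model ARN -- replace with approved models.
                "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku*"
            ],
        },
        {
            "Sid": "DenyModelCustomization",
            "Effect": "Deny",
            "Action": ["bedrock:CreateModelCustomizationJob"],
            "Resource": "*",
        },
    ],
}
```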

Encryption for AWS AI

  • S3 bucket encryption: Enable SSE-KMS with customer-managed keys for all buckets containing training data, model artifacts, and pipeline outputs
  • EBS volume encryption: Encrypt all SageMaker instance storage volumes. Use KMS keys with key policies that restrict access to authorized roles only
  • Inter-container encryption: Enable inter-container traffic encryption for distributed training jobs to protect gradient data in transit
  • Endpoint encryption: All SageMaker endpoints use TLS by default. Enforce TLS 1.2 minimum and configure custom certificates for internal endpoints
  • Bedrock data encryption: Use customer-managed KMS keys for Bedrock model customization data. Bedrock does not use your prompts to train models, and model invocation logging is disabled by default; if you enable it, send the logs only to encrypted destinations you control
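The S3 bullet above corresponds to `s3.put_bucket_encryption`. A sketch of the arguments enforcing SSE-KMS with a customer-managed key (bucket and key ARN are placeholders):

```python
def sse_kms_bucket_encryption(bucket: str, kms_key_arn: str) -> dict:
    """Arguments for s3.put_bucket_encryption enforcing SSE-KMS with a
    customer-managed key as the bucket default."""
    return {
        "Bucket": bucket,
        "ServerSideEncryptionConfiguration": {
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    },
                    # Bucket keys cut KMS request volume (and cost) for
                    # high-throughput training-data reads.
                    "BucketKeyEnabled": True,
                }
            ]
        },
    }

args = sse_kms_bucket_encryption(
    "ml-training-data",
    "arn:aws:kms:us-east-1:123456789012:key/example-key-id")
```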

CloudTrail for ML Operations

Audit Gap: CloudTrail logs SageMaker API calls but does not capture the contents of training data or model predictions. For content-level auditing, implement custom logging within your ML application code and send logs to CloudWatch or S3.
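One way to fill this gap without storing sensitive content is to log a digest of each payload rather than the payload itself. A minimal sketch (endpoint and caller names are illustrative; in real code the JSON record would be shipped to CloudWatch Logs or S3):

```python
import hashlib
import json
import time

def prediction_audit_record(endpoint: str, caller: str, payload: bytes) -> str:
    """Build a content-level audit record for one model invocation.
    Logging a SHA-256 digest proves *what* was sent, and when, without
    persisting the sensitive input itself."""
    record = {
        "timestamp": time.time(),
        "endpoint": endpoint,
        "caller": caller,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "payload_bytes": len(payload),
    }
    # In production, ship this line to CloudWatch Logs or an S3 audit bucket.
    return json.dumps(record)

line = prediction_audit_record("fraud-endpoint", "arn:aws:iam::123456789012:user/alice",
                               b'{"amount": 912.50}')
```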

API Call Logging

Enable CloudTrail for all SageMaker and Bedrock API calls. Monitor for suspicious patterns like bulk endpoint creation, unusual model downloads, or off-hours training jobs.
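One way to act on these patterns is an EventBridge rule over CloudTrail management events. A hedged sketch of the event pattern (the event names are real SageMaker API calls; tune the list to your threat model):

```python
# EventBridge event pattern matching high-risk SageMaker control-plane
# calls recorded by CloudTrail. Attach this pattern to a rule whose
# target is your alerting pipeline (SNS, Lambda, etc.).
suspicious_sagemaker_pattern = {
    "source": ["aws.sagemaker"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["sagemaker.amazonaws.com"],
        "eventName": [
            "CreateEndpoint",
            "CreateTrainingJob",
            "CreatePresignedNotebookInstanceUrl",
        ],
    },
}
```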

Data Access Logging

Enable S3 server access logging and CloudTrail data events for buckets containing ML data. Track who accessed training data and model artifacts.
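Object-level tracking is configured through CloudTrail data-event selectors. A sketch of the arguments for `cloudtrail.put_event_selectors`, with a hypothetical trail name and example bucket ARNs:

```python
# Record object-level reads and writes on the ML buckets, in addition to
# the usual management events. Bucket names are examples -- replace.
ml_data_event_args = {
    "TrailName": "ml-audit-trail",  # hypothetical trail name
    "EventSelectors": [
        {
            "ReadWriteType": "All",
            "IncludeManagementEvents": True,
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    "Values": [
                        "arn:aws:s3:::ml-training-data/",
                        "arn:aws:s3:::ml-model-artifacts/",
                    ],
                }
            ],
        }
    ],
}
```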

Cost Anomaly Detection

Configure AWS Cost Anomaly Detection with alerts for ML services. A sudden spike in GPU instance usage may indicate compromised credentials being used for crypto mining or unauthorized training.
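The intuition behind such an alert can be sketched as a simple threshold over a trailing baseline; this is a crude local stand-in for Cost Anomaly Detection, not its actual algorithm:

```python
def gpu_spend_alert(daily_spend: list, today: float, threshold: float = 3.0) -> bool:
    """Flag today's GPU instance spend if it exceeds `threshold` times
    the trailing daily average -- e.g. a fleet averaging ~$120/day that
    suddenly bills $900 warrants an immediate credential review."""
    baseline = sum(daily_spend) / len(daily_spend)
    return today > threshold * baseline

# Normal week, then a spike consistent with hijacked GPU capacity:
assert gpu_spend_alert([110, 125, 118, 130, 122], 900)
```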

GuardDuty Integration

Enable GuardDuty for threat detection across your ML infrastructure. GuardDuty can detect unusual API calls, compromised credentials, and data exfiltration patterns.

💡
Next Up: In the next lesson, we explore GCP AI Security — Vertex AI security controls, VPC Service Controls, CMEK encryption, and Cloud Audit Logs for ML workloads.