Sandbox and Isolation Strategies
The most effective way to prevent AI agents from causing infrastructure damage is to ensure they never touch real infrastructure in the first place. This lesson covers sandboxing techniques from Docker containers to full cloud emulators.
Why Sandboxing Matters
Even with dry-run patterns and permission models, there is always a risk that an AI agent finds a way to execute a destructive command. Sandboxing adds a structural guarantee: the agent physically cannot reach production resources.
Docker Containers for Agent Execution
Running your AI agent inside a Docker container provides process isolation, filesystem isolation, and network control:
```dockerfile
# Dockerfile.agent-sandbox
FROM ubuntu:24.04

# Install development tools
# (in practice terraform, kubectl, and helm come from their vendor
# apt repositories; they are not in Ubuntu's default package lists)
RUN apt-get update && apt-get install -y \
    git curl python3 python3-pip nodejs npm \
    terraform kubectl helm \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user for the agent
RUN useradd -m -s /bin/bash agent
USER agent
WORKDIR /home/agent/workspace

# Project files are mounted at runtime;
# no cloud credentials are baked into the image
CMD ["/bin/bash"]
```
```yaml
# docker-compose.agent-sandbox.yml
services:
  agent-sandbox:
    build:
      context: .
      dockerfile: Dockerfile.agent-sandbox
    volumes:
      - ./project:/home/agent/workspace:rw
      # DO NOT mount ~/.aws, ~/.kube, or other credential dirs
    networks:
      - sandbox-net
    mem_limit: 4g
    cpus: 2
    security_opt:
      - no-new-privileges:true
    read_only: true
    tmpfs:
      - /tmp:rw,size=500m
      - /home/agent/.cache:rw,size=200m

  # Local services for testing
  localstack:
    image: localstack/localstack:latest
    ports:
      - "4566:4566"
    environment:
      - SERVICES=s3,sqs,dynamodb,lambda,iam,ec2
    networks:
      - sandbox-net

  azurite:
    image: mcr.microsoft.com/azure-storage/azurite:latest
    ports:
      - "10000:10000"
      - "10001:10001"
      - "10002:10002"
    networks:
      - sandbox-net

networks:
  sandbox-net:
    driver: bridge
    internal: true  # No external internet access
```
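The safety of this setup depends entirely on what the compose file actually mounts and exposes. A minimal sketch of a pre-flight check an orchestrator could run before starting the sandbox (an illustrative helper, not part of any official tooling; it assumes the compose config has already been parsed into Python dicts):

```python
# Pre-flight check: refuse to start the sandbox if the (already parsed)
# compose config mounts credential directories or allows external networking.
FORBIDDEN_MOUNTS = (".aws", ".kube", ".azure", ".config/gcloud", ".ssh")

def check_sandbox_config(service: dict, networks: dict) -> list[str]:
    """Return a list of violations; an empty list means the config looks safe."""
    violations = []
    for volume in service.get("volumes", []):
        host_path = volume.split(":", 1)[0]
        if any(host_path.rstrip("/").endswith(d) for d in FORBIDDEN_MOUNTS):
            violations.append(f"credential directory mounted: {host_path}")
    for net_name in service.get("networks", []):
        if not networks.get(net_name, {}).get("internal", False):
            violations.append(f"network '{net_name}' is not internal")
    return violations

# Example: the agent-sandbox service from the compose file above
service = {
    "volumes": ["./project:/home/agent/workspace:rw"],
    "networks": ["sandbox-net"],
}
networks = {"sandbox-net": {"driver": "bridge", "internal": True}}
print(check_sandbox_config(service, networks))  # → []
```

Mounting `~/.aws` or attaching a non-internal network would each produce a violation, so the sandbox simply refuses to start rather than relying on the agent to behave.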
Dedicated Cloud Accounts for Agent Testing
For testing that requires real cloud APIs, create completely separate accounts/projects:
| Cloud | Isolation Strategy | Billing Protection |
|---|---|---|
| AWS | Separate AWS account in an Organization OU with SCPs | Budget alerts + hard limits via SCPs |
| Azure | Separate subscription with spending cap | Budget alerts + spending limits |
| GCP | Separate project with billing budget | Budget alerts + billing cap |
An example SCP for the agent's sandbox account. Note that the instance-type condition must live in its own statement: a condition attached to a statement that also covers Redshift or SageMaker actions would never match those requests (the `ec2:InstanceType` key is absent), so the deny would silently not apply.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyExpensiveServices",
      "Effect": "Deny",
      "Action": [
        "redshift:*",
        "sagemaker:CreateNotebookInstance"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyLargeInstances",
      "Effect": "Deny",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "StringNotLike": {
          "ec2:InstanceType": ["t3.micro", "t3.small"]
        }
      }
    },
    {
      "Sid": "DenyRegionsOutsideUS",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": ["us-east-1", "us-west-2"]
        }
      }
    }
  ]
}
```
LocalStack for AWS Testing
LocalStack emulates AWS services locally, letting AI agents interact with S3, DynamoDB, Lambda, EC2, and more without touching real AWS:
```bash
# Start LocalStack
docker run -d --name localstack -p 4566:4566 localstack/localstack

# Configure the AWS CLI to point at LocalStack
export AWS_ENDPOINT_URL=http://localhost:4566
export AWS_ACCESS_KEY_ID=test
export AWS_SECRET_ACCESS_KEY=test
export AWS_DEFAULT_REGION=us-east-1

# Now the agent's AWS commands go to LocalStack, not real AWS
aws s3 mb s3://my-test-bucket
aws dynamodb create-table --table-name Users \
    --attribute-definitions AttributeName=id,AttributeType=S \
    --key-schema AttributeName=id,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST

# Even destructive commands are safe
aws s3 rb s3://my-test-bucket --force  # Only affects LocalStack
```
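Rather than exporting these variables by hand, a harness can inject them into the agent's subprocess environment, so every AWS CLI or SDK call the agent makes is redirected to LocalStack. A minimal sketch (the endpoint and dummy credentials mirror the exports above):

```python
import os
import subprocess

def localstack_env(endpoint: str = "http://localhost:4566") -> dict[str, str]:
    """Environment that redirects AWS SDK/CLI calls to LocalStack."""
    env = dict(os.environ)
    env.update({
        "AWS_ENDPOINT_URL": endpoint,     # honored by AWS CLI v2 and recent SDKs
        "AWS_ACCESS_KEY_ID": "test",      # LocalStack accepts any credentials
        "AWS_SECRET_ACCESS_KEY": "test",
        "AWS_DEFAULT_REGION": "us-east-1",
    })
    return env

# Run an agent-issued command with the sandboxed environment, e.g.:
# subprocess.run(["aws", "s3", "mb", "s3://my-test-bucket"], env=localstack_env())
env = localstack_env()
print(env["AWS_ENDPOINT_URL"])  # http://localhost:4566
```

Because the redirection happens in the environment the agent inherits, the agent itself needs no LocalStack-specific configuration.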
Azurite for Azure Local Testing
Azurite provides local emulation for Azure Blob Storage, Queue Storage, and Table Storage:
```bash
# Install and run Azurite
npm install -g azurite
azurite --location ./azurite-data --debug ./azurite-debug.log

# Or use Docker
docker run -d --name azurite -p 10000:10000 -p 10001:10001 -p 10002:10002 \
    mcr.microsoft.com/azure-storage/azurite

# Configure the connection string for local Azurite
# (this is Azurite's well-known default account key, not a secret)
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"

# The agent can now use Azure Storage commands safely
az storage container create --name test-container
az storage blob upload --container-name test-container --file ./data.json --name data.json
```
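Before letting an agent run storage commands, it is worth asserting that the connection string really points at the local emulator and not at a real account. A small sketch that parses the connection string's endpoint entries (an illustrative helper; the abbreviated account key is a placeholder):

```python
from urllib.parse import urlparse

def is_local_emulator(conn_str: str) -> bool:
    """True only if every *Endpoint in the connection string targets localhost."""
    parts = dict(
        kv.split("=", 1) for kv in conn_str.rstrip(";").split(";") if "=" in kv
    )
    endpoints = [v for k, v in parts.items() if k.endswith("Endpoint")]
    if not endpoints:
        return False  # no explicit endpoints: assume it could reach real Azure
    return all(
        urlparse(ep).hostname in ("127.0.0.1", "localhost") for ep in endpoints
    )

conn = ("DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;"
        "AccountKey=...;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;")
print(is_local_emulator(conn))  # True
```

A connection string whose `BlobEndpoint` points at `*.blob.core.windows.net` fails the check, so the harness can abort before any command reaches a real storage account.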
GCP Emulators
Google Cloud provides official emulators for several services:
```bash
# Pub/Sub emulator
gcloud beta emulators pubsub start --project=test-project

# Datastore emulator
gcloud beta emulators datastore start --project=test-project

# Bigtable emulator
gcloud beta emulators bigtable start

# Firestore emulator
gcloud beta emulators firestore start --project=test-project

# Set environment variables to point to the emulators
$(gcloud beta emulators pubsub env-init)
$(gcloud beta emulators datastore env-init)
```
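The `env-init` commands print `export` lines that point the client libraries at the emulators via variables like `PUBSUB_EMULATOR_HOST`. A sketch of the same wiring done directly in a harness (the host:port values assume the emulators' default ports; adjust if you start them elsewhere):

```python
import os

# Environment variables the GCP client libraries check before talking to
# real endpoints; host:port values assume the emulators' default ports.
EMULATOR_ENV = {
    "PUBSUB_EMULATOR_HOST": "localhost:8085",
    "DATASTORE_EMULATOR_HOST": "localhost:8081",
    "BIGTABLE_EMULATOR_HOST": "localhost:8086",
    "FIRESTORE_EMULATOR_HOST": "localhost:8080",
}

def gcp_emulator_env(project: str = "test-project") -> dict[str, str]:
    env = dict(os.environ)
    env.update(EMULATOR_ENV)
    env["GOOGLE_CLOUD_PROJECT"] = project
    # Deliberately drop real credentials so nothing falls through to prod.
    env.pop("GOOGLE_APPLICATION_CREDENTIALS", None)
    return env

env = gcp_emulator_env()
print(env["PUBSUB_EMULATOR_HOST"])  # localhost:8085
```

Removing `GOOGLE_APPLICATION_CREDENTIALS` is the belt-and-braces part: even if an emulator variable is missing, the client library has no credentials with which to reach a real project.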
Feature Branches + Ephemeral Environments
Combine git feature branches with ephemeral cloud environments so agents work in isolated copies:
```yaml
name: Ephemeral Environment

on:
  pull_request:
    types: [opened, synchronize, closed]  # 'closed' is needed for cleanup

jobs:
  deploy-ephemeral:
    if: github.event.action != 'closed'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Create ephemeral namespace
        run: |
          NAMESPACE="pr-${{ github.event.pull_request.number }}"
          kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
          helm upgrade --install "app-$NAMESPACE" ./chart \
            --namespace "$NAMESPACE" \
            --set image.tag=${{ github.sha }} \
            --set env=ephemeral
      - name: Run agent tests
        run: |
          NAMESPACE="pr-${{ github.event.pull_request.number }}"
          # The agent's changes are tested in an isolated namespace
          kubectl -n "$NAMESPACE" run tests --image=test-runner \
            --restart=Never --rm -i

  cleanup-ephemeral:
    if: github.event.action == 'closed'
    runs-on: ubuntu-latest
    steps:
      - name: Delete ephemeral namespace
        run: |
          NAMESPACE="pr-${{ github.event.pull_request.number }}"
          kubectl delete namespace "$NAMESPACE"
```
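Kubernetes namespace names must be valid DNS labels: lowercase alphanumerics and hyphens, at most 63 characters. Deriving the name from a PR number is safe, but deriving it from a branch name needs sanitizing. A sketch of the helper the workflow's shell steps approximate (`ephemeral_namespace` is a hypothetical name for illustration):

```python
import re

def ephemeral_namespace(identifier: str, prefix: str = "pr") -> str:
    """Derive a DNS-1123-safe namespace name from a PR number or branch name."""
    # Lowercase, replace every disallowed run with a hyphen, trim stray hyphens
    label = re.sub(r"[^a-z0-9-]+", "-", identifier.lower()).strip("-")
    # Enforce the 63-character DNS label limit, never ending on a hyphen
    return f"{prefix}-{label}"[:63].rstrip("-")

print(ephemeral_namespace("1234"))                 # pr-1234
print(ephemeral_namespace("feature/Add_New_API"))  # pr-feature-add-new-api
```

Doing this in one place keeps the deploy, test, and cleanup jobs pointed at exactly the same namespace, which matters because a mismatch would leak ephemeral environments.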
GitOps: Agents Propose, Humans Approve
The safest workflow for AI agents and infrastructure is GitOps: agents make changes in git, humans review and approve, and the CI/CD pipeline applies:
Agent Creates a Branch
The AI agent creates a feature branch and commits its infrastructure changes (Terraform files, Kubernetes manifests, Helm values).
Agent Opens a Pull Request
The agent opens a PR with a description of the changes, including the plan output.
Automated Checks Run
CI runs terraform plan, security scanning, policy checks, and cost estimation on the PR.
Human Reviews and Approves
A human reviews the plan output, the code changes, and the automated check results.
Pipeline Applies on Merge
Only after merge does the CI/CD pipeline run terraform apply. The agent never directly applies.
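The "agent never directly applies" rule can also be enforced mechanically: a wrapper around `terraform apply` that runs only when the process is inside the CI pipeline on the default branch, i.e. after a human-approved merge. A sketch assuming GitHub Actions' standard `GITHUB_ACTIONS` and `GITHUB_REF` environment variables (the wrapper itself is hypothetical):

```python
import os

def in_pipeline_context(env: dict, default_branch: str = "main") -> bool:
    """True only when running in GitHub Actions on the default branch."""
    return (
        env.get("GITHUB_ACTIONS") == "true"
        and env.get("GITHUB_REF") == f"refs/heads/{default_branch}"
    )

# The wrapper would invoke `terraform apply` only when this holds, e.g.:
# if in_pipeline_context(dict(os.environ)):
#     subprocess.run(["terraform", "apply", "-auto-approve"], check=True)
print(in_pipeline_context({"GITHUB_ACTIONS": "true",
                           "GITHUB_REF": "refs/heads/main"}))  # True
print(in_pipeline_context({}))                                 # False
```

An agent running locally, or a CI job on a feature branch, fails the check, so even a confused or compromised agent cannot short-circuit the propose/approve/apply flow.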
Key Takeaways
- Sandboxing provides structural safety that doesn't depend on the agent's behavior
- Use Docker with `--network none` (or an internal-only network) for complete isolation from cloud services
- LocalStack, Azurite, and GCP emulators let agents test cloud operations locally
- Create dedicated cloud accounts with SCPs/budgets for agent testing that needs real APIs
- Ephemeral environments per PR give agents isolated playgrounds
- GitOps is the ultimate safety pattern: agents propose, humans approve, pipelines apply
Lilly Tech Systems