Advanced

CI/CD Pipeline Safety

The most structurally safe pattern for AI agents and infrastructure is simple: agents should never directly apply infrastructure changes. Instead, agents propose changes through pull requests, and CI/CD pipelines apply them after human review.

Why Agents Should Never Directly Apply

Even with all the guardrails from previous lessons, allowing an AI agent to directly run terraform apply or kubectl apply carries inherent risk. The GitOps pattern eliminates this risk structurally:

💡
Key principle: If the agent can only create git commits and open pull requests, then the worst it can do is create a bad PR. A bad PR is easily reviewed, commented on, and closed. A bad terraform destroy requires incident response.
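
One way to enforce this principle at the tool layer is a command allowlist in whatever wrapper gives the agent shell access. The sketch below is illustrative only: the ALLOWED_PREFIXES list and the is_allowed helper are hypothetical names, and the use of the GitHub CLI (gh) to open pull requests is an assumption.

```python
# Hypothetical allowlist for an agent's shell tool: git and PR creation only.
# Anything not listed here (terraform apply, kubectl apply, ...) is rejected.
ALLOWED_PREFIXES = [
    ("git", "checkout"),
    ("git", "add"),
    ("git", "commit"),
    ("git", "push"),
    ("gh", "pr", "create"),  # assumes the GitHub CLI is used to open PRs
]


def is_allowed(argv: list[str]) -> bool:
    """Return True only if argv begins with one of the allowlisted prefixes."""
    return any(tuple(argv[: len(prefix)]) == prefix for prefix in ALLOWED_PREFIXES)


if __name__ == "__main__":
    for command in (["git", "commit", "-m", "fix"], ["terraform", "apply"]):
        verdict = "allowed" if is_allowed(command) else "blocked"
        print(" ".join(command), "->", verdict)
```

An allowlist like this is a convenience layer, not the security boundary. The structural guarantee still comes from the agent's credentials having no write access to infrastructure.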

The GitOps Workflow for AI Agents

  1. Agent Creates Branch

    The AI agent creates a feature branch: git checkout -b agent/fix-scaling-config

  2. Agent Makes Changes

    The agent edits Terraform files, Kubernetes manifests, or other IaC code on the branch.

  3. Agent Commits and Pushes

    Changes are committed with clear messages: git commit -m "feat: update autoscaling min/max for production ECS service"

  4. Agent Opens PR

    The agent creates a pull request with a description of what changed and why.

  5. CI Runs Plan and Checks

    GitHub Actions (or similar) runs terraform plan, security scanning, cost estimation, and policy checks.

  6. Human Reviews

    A team member reviews the plan output, the code diff, and all automated check results.

  7. Merge Triggers Apply

    Only after approval and merge does the pipeline run terraform apply, using a plan file generated in that same pipeline run.
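
Steps 1 through 4 can be sketched as a fixed command sequence that the agent's tooling runs on its behalf. The function name build_pr_commands, the enforced agent/ branch prefix, the origin remote, and the use of the GitHub CLI (gh pr create) are assumptions for illustration. The function only builds the commands and does not execute them.

```python
def build_pr_commands(
    branch: str,
    commit_message: str,
    pr_title: str,
    pr_body: str,
    base: str = "main",
) -> list[list[str]]:
    """Build the git/gh commands for steps 1-4 of the workflow above."""
    # Assumption: agent branches use an "agent/" prefix so they are easy to spot.
    if not branch.startswith("agent/"):
        raise ValueError("agent branches must start with 'agent/'")
    return [
        ["git", "checkout", "-b", branch],        # 1. create the branch
        # 2. the agent edits IaC files here, before staging
        ["git", "add", "--all"],                  # 3. stage ...
        ["git", "commit", "-m", commit_message],  #    ... and commit
        ["git", "push", "--set-upstream", "origin", branch],
        ["gh", "pr", "create", "--base", base, "--head", branch,
         "--title", pr_title, "--body", pr_body],  # 4. open the PR
    ]


if __name__ == "__main__":
    for cmd in build_pr_commands(
        "agent/fix-scaling-config",
        "feat: update autoscaling min/max for production ECS service",
        "Update ECS autoscaling limits",
        "Adjusts min/max task counts for the production service.",
    ):
        print(cmd)
```

Note that nothing in the sequence touches terraform or kubectl. Steps 5 through 7 belong to the pipeline and the reviewers.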

GitHub Actions: Terraform Plan on PR

YAML - .github/workflows/terraform-plan.yml
name: Terraform Plan

on:
  pull_request:
    paths:
      - 'terraform/**'

permissions:
  pull-requests: write
  contents: read

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

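      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.9.0
          # The wrapper script can interfere with piped or redirected output,
          # and no later step reads the wrapper's outputs, so disable it
          terraform_wrapper: false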

      - name: Terraform Init
        working-directory: terraform/
        run: terraform init

      - name: Terraform Plan
        id: plan
        working-directory: terraform/
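        run: |
          # $? after a pipe reports tee's exit status, which would hide a failed
          # plan, so capture terraform's own status from PIPESTATUS instead
          set +e
          terraform plan -no-color -out=plan.tfplan 2>&1 | tee plan-output.txt
          PLAN_EXIT=${PIPESTATUS[0]}
          echo "plan_exit_code=$PLAN_EXIT" >> "$GITHUB_OUTPUT"
          exit "$PLAN_EXIT"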

      - name: Check for Destructive Changes
        id: safety
        run: |
          terraform show -json plan.tfplan > plan.json
          # Count resources being destroyed
          DESTROYS=$(jq '[.resource_changes[] |
            select(.change.actions | contains(["delete"]))] |
            length' plan.json)
          echo "destroys=$DESTROYS" >> $GITHUB_OUTPUT

          if [ "$DESTROYS" -gt 0 ]; then
            echo "has_destroys=true" >> $GITHUB_OUTPUT
            echo "## Destructive Changes Detected" >> destroy-report.md
            jq -r '.resource_changes[] |
              select(.change.actions | contains(["delete"])) |
              "- **\(.address)** will be DESTROYED"' plan.json >> destroy-report.md
          fi
        working-directory: terraform/

      - name: Comment Plan on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const plan = fs.readFileSync('terraform/plan-output.txt', 'utf8');
            const destroys = '${{ steps.safety.outputs.destroys }}';
            const warning = destroys > 0
              ? `\n\n> **WARNING: ${destroys} resource(s) will be DESTROYED. Extra review required.**\n`
              : '';

            const body = `### Terraform Plan Output ${warning}
            \`\`\`\n${plan.substring(0, 60000)}\n\`\`\`
            *Plan generated by CI on commit ${{ github.sha }}*`;

            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });

      - name: Block PR if Destructive
        if: steps.safety.outputs.has_destroys == 'true'
        run: |
          echo "::error::Destructive changes detected. Requires senior reviewer approval."
          exit 1

GitHub Actions: Terraform Apply on Merge

YAML - .github/workflows/terraform-apply.yml
name: Terraform Apply

on:
  push:
    branches: [main]
    paths:
      - 'terraform/**'

jobs:
  apply:
    runs-on: ubuntu-latest
    environment: production  # Pauses for approval if the environment has required reviewers

    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.9.0  # Match the version pinned in the plan workflow

      - name: Terraform Init
        working-directory: terraform/
        run: terraform init

      - name: Terraform Plan
        working-directory: terraform/
        run: terraform plan -out=plan.tfplan

      - name: Terraform Apply
        working-directory: terraform/
        run: terraform apply plan.tfplan
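
Because this workflow runs terraform plan again after the merge, the plan that gets applied is not necessarily the one reviewers saw on the PR. One possible safeguard is to compare the two plans before applying. The sketch below assumes both plans have been exported with terraform show -json and loaded as dictionaries. The names plan_fingerprint and plans_match are hypothetical. It compares only resource addresses and actions, not attribute values, so it catches new or changed operations but not every difference.

```python
import hashlib
import json


def plan_fingerprint(plan: dict) -> str:
    """Hash the (address, actions) pairs of a Terraform plan JSON document."""
    changes = sorted(
        (rc["address"], tuple(rc["change"]["actions"]))
        for rc in plan.get("resource_changes", [])
        if rc["change"]["actions"] != ["no-op"]  # unchanged resources are noise
    )
    payload = json.dumps(changes).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plans_match(reviewed_plan: dict, fresh_plan: dict) -> bool:
    """Return True if both plans touch the same resources with the same actions."""
    return plan_fingerprint(reviewed_plan) == plan_fingerprint(fresh_plan)


if __name__ == "__main__":
    reviewed = {"resource_changes": [
        {"address": "aws_ecs_service.app", "change": {"actions": ["update"]}},
    ]}
    fresh = {"resource_changes": [
        {"address": "aws_ecs_service.app", "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "change": {"actions": ["delete"]}},
    ]}
    print("plans match:", plans_match(reviewed, fresh))
```

A pipeline could fail the apply job when the fingerprints differ and ask for a fresh review.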

Terraform Cloud with Sentinel Policies

Terraform Cloud/Enterprise offers Sentinel, a policy-as-code framework that evaluates between plan and apply:

Sentinel - Block Resource Destruction Policy
# sentinel/block-destroys.sentinel

import "tfplan/v2" as tfplan

# Get all resources being destroyed
destroyed_resources = filter tfplan.resource_changes as _, rc {
    rc.change.actions contains "delete"
}

# Of those, keep only the stateful resource types that must never be destroyed
protected_destroys = filter destroyed_resources as _, rc {
    rc.type in ["aws_db_instance", "aws_s3_bucket",
                "aws_dynamodb_table", "aws_efs_file_system"]
}

# Policy: block destruction of protected resource types
main = rule {
    length(protected_destroys) is 0
}

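# Informational only: Sentinel enforces just the main rule, so this rule has
# no effect unless main references it. To warn on every destruction, move it
# into its own policy with an advisory enforcement level.
advisory_no_destroys = rule {
    length(destroyed_resources) is 0
}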

Approval Gates in CI/CD

GitHub Environments with required reviewers create mandatory approval gates:

YAML - GitHub Environment Protection Rules
# Configure in GitHub Settings > Environments > production
# Required reviewers: team-leads
# Wait timer: 5 minutes (gives time to cancel)
# Deployment branches: main only

# In your workflow:
jobs:
  deploy-production:
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://app.example.com
    steps:
      # This job won't start until a required reviewer approves
      - name: Deploy
        run: |
          echo "Deploying to production..."
          # Assumes plan.tfplan was created or downloaded earlier in this job
          terraform apply plan.tfplan
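
Since the approval gate only exists if the environment is actually configured, it can be worth auditing that configuration. The sketch below assumes the environment has been fetched from GitHub's REST API (the "get an environment" endpoint) and parsed into a dictionary with a protection_rules list whose entries carry a type field. The helper name missing_protections and the choice of required rule types are assumptions for illustration.

```python
# Rule types this audit expects on the production environment (an assumption
# that matches the settings listed above: reviewers, wait timer, branch limits).
REQUIRED_RULE_TYPES = ("required_reviewers", "wait_timer", "branch_policy")


def missing_protections(environment: dict) -> list[str]:
    """Return the expected protection rules absent from an environment payload."""
    rules = {
        rule.get("type"): rule for rule in environment.get("protection_rules", [])
    }
    missing = [rule_type for rule_type in REQUIRED_RULE_TYPES if rule_type not in rules]
    # A reviewers rule with an empty reviewer list gates nothing.
    if "required_reviewers" in rules and not rules["required_reviewers"].get("reviewers"):
        missing.append("required_reviewers (no reviewers listed)")
    return missing


if __name__ == "__main__":
    sample = {
        "name": "production",
        "protection_rules": [{"type": "wait_timer", "wait_timer": 5}],
    }
    print(missing_protections(sample))
```

Running a check like this on a schedule would flag an environment whose reviewers were removed after the workflow was written.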

Branch Protection Rules

Configure branch protection to ensure agents can't bypass the review process:

Rule                      | Purpose                      | Setting
Require PR reviews        | No direct pushes to main     | At least 1 reviewer
Require status checks     | CI must pass before merge    | terraform-plan must pass
Require CODEOWNERS review | IaC changes need infra team  | CODEOWNERS file
Dismiss stale reviews     | New pushes require re-review | Enabled
No force pushes           | Prevent history rewriting    | Enabled
Restrict push access      | Only CI bot can push to main | Include CI bot only

CODEOWNERS - Require Infra Review for IaC
# .github/CODEOWNERS
# Terraform changes require infrastructure team review
/terraform/         @org/infrastructure-team
*.tf                @org/infrastructure-team
*.tfvars            @org/infrastructure-team

# Kubernetes manifests require platform team review
/k8s/               @org/platform-team
*.yaml              @org/platform-team

# CI/CD workflow changes require DevOps team review
/.github/workflows/ @org/devops-team
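
In a CODEOWNERS file the last matching pattern takes precedence, so the order of the entries above matters. The sketch below illustrates that rule with a deliberately simplified matcher. It handles only the two pattern shapes used in this file (root-anchored directories such as /terraform/ and extension globs such as *.tf) and is not a full implementation of CODEOWNERS matching. The names RULES and owner_for are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename
from typing import Optional

# The entries from the CODEOWNERS file above, in file order.
RULES = [
    ("/terraform/", "@org/infrastructure-team"),
    ("*.tf", "@org/infrastructure-team"),
    ("*.tfvars", "@org/infrastructure-team"),
    ("/k8s/", "@org/platform-team"),
    ("*.yaml", "@org/platform-team"),
    ("/.github/workflows/", "@org/devops-team"),
]


def owner_for(path: str) -> Optional[str]:
    """Return the owning team for a repo-relative path; the last match wins."""
    owner = None
    for pattern, team in RULES:
        if pattern.startswith("/") and pattern.endswith("/"):
            matched = path.startswith(pattern[1:])  # root-anchored directory
        else:
            matched = fnmatchcase(basename(path), pattern)  # glob at any depth
        if matched:
            owner = team  # later rules override earlier ones
    return owner


if __name__ == "__main__":
    for path in ("terraform/main.tf", "terraform/values.yaml",
                 ".github/workflows/deploy.yaml"):
        print(path, "->", owner_for(path))
```

One consequence worth noticing: a .yaml file under terraform/ resolves to the platform team rather than the infrastructure team, because the *.yaml entry appears later in the file.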

Automated Plan Review: Detecting Destruction in PRs

Python - PR Plan Analyzer
import json
import sys

def analyze_plan(plan_json_path: str) -> dict:
    """Analyze a Terraform plan JSON and return a safety report."""
    with open(plan_json_path) as f:
        plan = json.load(f)

    report = {
        "creates": [],
        "updates": [],
        "destroys": [],
        "replaces": [],
        "risk_level": "low",
        "blocking_issues": [],
    }

    HIGH_RISK_TYPES = [
        "aws_db_instance", "aws_rds_cluster",
        "aws_dynamodb_table", "aws_s3_bucket",
        "aws_efs_file_system", "aws_elasticache_cluster",
        "azurerm_sql_database", "azurerm_storage_account",
        "google_sql_database_instance", "google_storage_bucket",
    ]

    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        resource = rc["address"]  # Full address, including any module path

        if actions == ["create"]:
            report["creates"].append(resource)
        elif actions == ["update"]:
            report["updates"].append(resource)
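        elif "delete" in actions and "create" in actions:
            # A replacement destroys the existing resource before or after
            # creating the new one, so treat it as at least high risk
            report["replaces"].append(resource)
            if report["risk_level"] != "critical":
                report["risk_level"] = "high"
            if rc["type"] in HIGH_RISK_TYPES:
                report["risk_level"] = "critical"
                report["blocking_issues"].append(
                    f"High-risk resource {resource} is being REPLACED"
                )
        elif "delete" in actions:
            report["destroys"].append(resource)
            # Escalate only: never downgrade an earlier "critical" finding
            if report["risk_level"] != "critical":
                report["risk_level"] = "high"
            if rc["type"] in HIGH_RISK_TYPES:
                report["risk_level"] = "critical"
                report["blocking_issues"].append(
                    f"High-risk resource {resource} is being DESTROYED"
                )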

    return report

if __name__ == "__main__":
    report = analyze_plan(sys.argv[1])
    print(json.dumps(report, indent=2))
    if report["blocking_issues"]:
        sys.exit(1)
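
The report produced by the analyzer is a plain dictionary, so turning it into a reviewer-friendly summary is straightforward. The sketch below assumes the report keys shown above (creates, updates, replaces, destroys, risk_level, blocking_issues). The format_report name and the output layout are illustrative choices.

```python
# (report key, label) pairs in the order they should appear in the summary.
SECTIONS = (
    ("creates", "Create"),
    ("updates", "Update"),
    ("replaces", "Replace"),
    ("destroys", "Destroy"),
)


def format_report(report: dict) -> str:
    """Render a plan safety report as a short plain-text summary."""
    lines = [f"Risk level: {report.get('risk_level', 'unknown').upper()}"]
    for key, label in SECTIONS:
        resources = report.get(key, [])
        if resources:  # skip empty sections to keep the summary short
            lines.append(f"{label} ({len(resources)}):")
            lines.extend(f"  - {name}" for name in resources)
    issues = report.get("blocking_issues", [])
    if issues:
        lines.append("Blocking issues:")
        lines.extend(f"  ! {issue}" for issue in issues)
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report({
        "creates": [], "updates": ["aws_ecs_service.app"],
        "destroys": ["aws_db_instance.main"], "replaces": [],
        "risk_level": "critical",
        "blocking_issues": ["High-risk resource aws_db_instance.main is being DESTROYED"],
    }))
```

The resulting text could be posted as a PR comment alongside the raw plan output.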

Remember: The CI/CD pipeline's credentials should be the only credentials with write access to production infrastructure. Developers and agents should have read-only access. This structurally prevents any direct-apply path.

Key Takeaways

  • Agents should never directly apply infrastructure changes — always go through PR and pipeline
  • GitOps workflow: agent creates branch, opens PR, CI plans, human reviews, pipeline applies
  • Post Terraform plan output as PR comments so reviewers see exactly what will change
  • Use Sentinel or OPA policies to block destructive changes at the pipeline level
  • GitHub Environments with required reviewers create mandatory approval gates
  • Branch protection + CODEOWNERS ensures IaC changes get the right review
  • Only the CI/CD pipeline should have production write credentials