QA Engineer Skills 2026

# AI Agents Reviewing Infrastructure Code

## Why AI Adds a Qualitative Layer

Rule-based tools like Checkov and tfsec check for known patterns. They answer "does this violate rule X?" But they cannot answer "is this change risky?" or "what will the operational impact be?" AI agents can bridge this gap by reasoning about infrastructure changes in context.

An AI agent reviewing a Terraform plan can explain that a resource replacement will cause downtime, suggest that an instance type is oversized for the workload, identify that a security group change creates an unintended exposure, or summarize a 200-line plan into a three-sentence human-readable description.


## What AI Agents Excel At in IaC Reviews

| Capability | Rule-Based Tools | AI Agent |
|---|---|---|
| Security misconfigurations | Check against known patterns | Reason about context-specific risk |
| Cost optimization | Flag specific instance types | Suggest right-sizing based on workload |
| Operational risk | Cannot assess | Explain impact of resource replacements |
| Environment drift | Diff configs | Explain intentional vs accidental differences |
| Change summarization | List resources changed | Natural-language summary of what and why |
| Compliance mapping | Check against rule IDs | Map changes to specific compliance frameworks |

### Specific Examples

  1. Detecting security misconfigurations -- AI models trained on cloud security best practices can flag overly permissive IAM policies, unencrypted resources, and public network exposure, with explanations of the specific risk.

  2. Identifying drift between environments -- comparing staging and production Terraform to find intentional vs accidental differences. An AI can explain "staging has a smaller instance type (intentional cost saving) but also has a missing encryption setting (likely accidental)."

  3. Suggesting cost optimizations -- flagging oversized instances, missing reserved capacity, or resources that should use spot pricing, with estimated savings.

  4. Explaining complex changes -- translating a 200-line Terraform plan into a human-readable summary: "This change will replace the RDS instance (causing 5 minutes of downtime) and add a new read replica."
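
The fourth example builds on Terraform's machine-readable plan format. As a minimal sketch (the function name and the sample plan fragment are hypothetical, but the `resource_changes` / `actions` structure matches `terraform show -json` output), the downtime-relevant replacements can be extracted before any AI is involved:

```python
import json

def find_replacements(plan):
    """Return addresses of resources Terraform will replace
    (delete + create) -- the changes most likely to cause downtime."""
    replaced = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # Terraform encodes a replacement as both "delete" and "create"
        # appearing in the same actions list.
        if "delete" in actions and "create" in actions:
            replaced.append(change["address"])
    return replaced

# Tiny hand-written plan fragment for illustration only
plan = json.loads("""
{"resource_changes": [
  {"address": "aws_db_instance.main",
   "change": {"actions": ["delete", "create"]}},
  {"address": "aws_instance.web",
   "change": {"actions": ["update"]}}
]}
""")

print(find_replacements(plan))  # ['aws_db_instance.main']
```

Feeding this pre-computed list into the AI prompt (rather than making the model dig it out of raw JSON) tends to make the "will this cause downtime?" reasoning far more reliable.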


## Prompt Pattern for IaC Review

The quality of AI review depends entirely on the prompt. A generic "review this code" prompt produces generic output. A structured prompt with specific review criteria produces actionable findings.

````markdown
You are reviewing a Terraform pull request. Analyze the following changes for:

1. **Security**: Any resources without encryption? Public access? Overly permissive IAM?
   Flag any security group with 0.0.0.0/0 ingress. Flag any IAM policy with Action: "*".
2. **Cost**: Any instances larger than needed? Missing auto-scaling? Resources that should
   use spot pricing? Estimate monthly cost impact.
3. **Reliability**: Missing health checks? Single points of failure? No multi-AZ?
   Flag any RDS without multi-AZ. Flag any service with replicas < 2.
4. **Compliance**: Does this follow our tagging policy? Are resources in approved regions?
   Required tags: Environment, Team, CostCenter.
5. **Operational Risk**: Will any resources be replaced (causing downtime)?
   Will any data be lost? Are there any irreversible changes?

Terraform plan output:

```json
{plan_json}
```

Previous production configuration:

```
{existing_config}
```

Provide findings as a structured list with severity (CRITICAL/HIGH/MEDIUM/LOW). For each finding, include:

- What the issue is
- Why it matters
- How to fix it
````

### Enhanced Prompt for Complex Reviews

```markdown
You are a senior infrastructure engineer reviewing a Terraform plan that modifies
production infrastructure. This is a high-stakes review.

Context:
- Cloud provider: AWS
- Environment: production (us-east-1)
- Service: payment processing (PCI-DSS scope)
- Current state: 3 AZs, multi-AZ RDS, ECS Fargate

Review the following plan for:

1. **Data loss risk**: Will any stateful resources (RDS, S3, DynamoDB) be destroyed
   or replaced? Flag with CRITICAL severity.
2. **Downtime risk**: Will any resource replacement cause service interruption?
   Estimate duration. Flag with HIGH severity.
3. **Security regression**: Does any change weaken the current security posture?
   Compare before/after for security groups, IAM, encryption. Flag with CRITICAL.
4. **PCI-DSS compliance**: Does this change maintain PCI-DSS requirements?
   (Encryption at rest, encryption in transit, access logging, network segmentation)
5. **Cost impact**: Estimate the monthly cost delta. Flag changes > $100/month.

Plan JSON:
{plan_json}

Checkov results:
{checkov_results}

Trivy results:
{trivy_results}

Output format:
## Summary
[2-3 sentence summary of what this plan does]

## Findings
| # | Severity | Category | Finding | Recommendation |
|---|----------|----------|---------|----------------|
```

## Integrating AI Review into CI

### GitHub Actions Workflow

```yaml
# .github/workflows/iac-review.yml
name: AI Infrastructure Review
on:
  pull_request:
    paths:
      - 'terraform/**'
      - 'k8s/**'
      - 'Dockerfile*'

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate Terraform plan
        run: |
          cd terraform/
          terraform init
          terraform plan -out=tfplan
          terraform show -json tfplan > tfplan.json

      - name: Static analysis
        run: |
          checkov -d ./terraform/ -o json > checkov-results.json
          trivy config ./terraform/ --format json > trivy-results.json

      - name: AI review of changes
        run: |
          # Combine plan + static analysis for AI review
          python scripts/ai_iac_review.py \
            --plan terraform/tfplan.json \
            --checkov checkov-results.json \
            --trivy trivy-results.json \
            --output review-comment.md

      - name: Post review to PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const body = fs.readFileSync('review-comment.md', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });
```

### The AI Review Script

````python
# scripts/ai_iac_review.py
import json
import argparse

from anthropic import Anthropic


def load_json(path):
    with open(path) as f:
        return json.load(f)


def build_review_prompt(plan, checkov_results, trivy_results):
    """Build a structured prompt for AI review."""
    # Summarize the plan
    changes = plan.get("resource_changes", [])
    creates = [c["address"] for c in changes if "create" in c["change"]["actions"]]
    updates = [c["address"] for c in changes if "update" in c["change"]["actions"]]
    destroys = [c["address"] for c in changes if "delete" in c["change"]["actions"]]

    # Summarize static analysis findings
    checkov_failures = checkov_results.get("results", {}).get("failed_checks", [])
    trivy_findings = trivy_results.get("Results", [])

    prompt = f"""Review this Terraform infrastructure change:

## Plan Summary
- Resources to create: {len(creates)} ({', '.join(creates[:10])})
- Resources to update: {len(updates)} ({', '.join(updates[:10])})
- Resources to destroy: {len(destroys)} ({', '.join(destroys[:10])})

## Full Plan
```json
{json.dumps(plan, indent=2)[:10000]}
```

## Checkov Findings ({len(checkov_failures)} failures)
```json
{json.dumps(checkov_failures[:20], indent=2)}
```

## Trivy Findings
```json
{json.dumps(trivy_findings[:20], indent=2)}
```

Provide a structured review with severity ratings."""

    return prompt


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--plan", required=True)
    parser.add_argument("--checkov", required=True)
    parser.add_argument("--trivy", required=True)
    parser.add_argument("--output", required=True)
    args = parser.parse_args()

    plan = load_json(args.plan)
    checkov = load_json(args.checkov)
    trivy = load_json(args.trivy)

    prompt = build_review_prompt(plan, checkov, trivy)

    client = Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        system=(
            "You are a senior infrastructure security engineer. Review the "
            "Terraform plan and static analysis results. Provide actionable "
            "findings with severity ratings."
        ),
        messages=[{"role": "user", "content": prompt}],
    )

    review = response.content[0].text

    with open(args.output, "w") as f:
        f.write("## AI Infrastructure Review\n\n")
        f.write(review)
        f.write("\n\n---\n*Generated by AI review pipeline*")


if __name__ == "__main__":
    main()
````


---

## The Complete IaC Testing Pipeline

```
+---------------------------------------------------------------+
| Developer pushes Terraform + K8s changes                      |
|                                                               |
| Stage 1: Static (seconds)                                     |
| +-- terraform fmt --check                                     |
| +-- terraform validate                                        |
| +-- checkov / tfsec / trivy config                            |
| +-- kubeval / kubeconform                                     |
| +-- OPA/Conftest custom policies                              |
|                                                               |
| Stage 2: Plan Analysis (30 seconds)                           |
| +-- terraform plan -> JSON                                    |
| +-- Automated plan assertions (no destroys, no public SGs)    |
| +-- AI agent review of plan diff                              |
|                                                               |
| Stage 3: Ephemeral Deploy (5-15 minutes)                      |
| +-- terraform apply to PR workspace                           |
| +-- Wait for health checks                                    |
| +-- Post preview URL                                          |
|                                                               |
| Stage 4: Integration Tests (5-20 minutes)                     |
| +-- API tests against ephemeral environment                   |
| +-- E2E browser tests                                         |
| +-- Security scan (DAST)                                      |
| +-- Performance baseline                                      |
|                                                               |
| Stage 5: Cleanup (on merge/close)                             |
| +-- terraform destroy PR workspace                            |
+---------------------------------------------------------------+
```


---

## Key Takeaways

- **Static analysis is free** -- run tfsec, Checkov, and OPA on every commit. There is no excuse for skipping it.
- **Plan assertions catch what static analysis cannot** -- analyzing the actual plan output reveals resource replacements, destroy operations, and drift.
- **Terratest is expensive but irreplaceable** -- some things can only be verified by deploying real infrastructure. Reserve this for critical modules.
- **Container scanning is a continuous process** -- images accumulate new CVEs daily. Scan on build and periodically in your registry.
- **Ephemeral environments transform confidence** -- testing against a real copy of your infrastructure eliminates "works on my machine" for cloud resources.
- **AI agents add a qualitative layer** -- they can explain complex plans, suggest optimizations, and catch patterns that rule-based tools miss.
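
The plan-assertion idea from the takeaways can be made concrete with a few lines of Python run in CI between `terraform show -json` and `terraform apply`. This is a hypothetical sketch: the function name is invented, and the `ingress` / `cidr_blocks` attribute paths match AWS security group resources but would need adjusting for other providers.

```python
def assert_plan_safe(plan):
    """Return a list of policy violations found in Terraform plan JSON.
    An empty list means the plan passes; CI fails the build otherwise."""
    violations = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # Rule 1: no destroy operations without an explicit override
        if "delete" in actions:
            violations.append(f"destroy operation: {change['address']}")
        # Rule 2: no security group rule open to the whole internet
        after = change.get("change", {}).get("after") or {}
        for rule in after.get("ingress", []):
            if "0.0.0.0/0" in rule.get("cidr_blocks", []):
                violations.append(f"world-open ingress: {change['address']}")
    return violations

# Hand-written plan fragment for illustration only
plan = {
    "resource_changes": [
        {"address": "aws_security_group.web",
         "change": {"actions": ["update"],
                    "after": {"ingress": [{"cidr_blocks": ["0.0.0.0/0"]}]}}}
    ]
}
print(assert_plan_safe(plan))  # ['world-open ingress: aws_security_group.web']
```

Assertions like these are deterministic and cheap, which is why they belong in the pipeline alongside the AI review rather than instead of it: the assertions gate the merge, while the AI comment explains the risk in human terms.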

---

## Interview Talking Point

> "I believe infrastructure testing should follow the same pyramid as application testing: cheap static analysis at the base, plan-time assertions in the middle, and real-infrastructure integration tests at the top. In my experience, the highest-value practice is ephemeral environments per PR, where you spin up a complete infrastructure copy using Terraform workspaces or Pulumi stacks, run the full test suite against it, and destroy it on merge. This catches issues that static analysis cannot -- like IAM policies that are syntactically valid but operationally broken. I also integrate AI agents into the review process: they analyze the Terraform plan JSON alongside Checkov and tfsec results and post a structured review comment on the PR that summarizes what will change, flags security risks, and estimates cost impact."