Infrastructure & DevOps | 14 min read | February 12, 2026

Terraform in Production: Patterns That Keep Infrastructure Sane

E. Lopez

CTO

Terraform is the de facto standard for infrastructure as code, but most teams only scratch the surface of what it can do. These are the patterns that, after managing dozens of production environments, I have found keep infrastructure maintainable, safe, and team-friendly at scale.

Project Structure That Scales

The biggest mistake teams make is treating Terraform like a single script. As your infrastructure grows, a flat structure becomes unmanageable.

The Module Pattern

Split your infrastructure into reusable modules. A module is a self-contained unit that encapsulates a logical piece of infrastructure — a VPC, an ECS cluster, a database, a CDN distribution.

```
infrastructure/
  modules/
    vpc/
    ecs-cluster/
    rds/
    cdn/
  environments/
    staging/
    production/
  shared/
    state-backend/
    iam/
```

Each environment directory calls the shared modules with environment-specific variables. Staging and production use identical module code — only the inputs differ.
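As a rough sketch of what that looks like, a staging environment might wire the modules together as below. The file path, the input names (`environment`, `cidr_block`, `instance_class`, `subnet_ids`), and the `private_subnet_ids` output are assumptions for illustration; your modules will define their own interface.

```hcl
# environments/staging/main.tf (hypothetical file and variable names)

module "vpc" {
  source = "../../modules/vpc"

  environment = "staging"
  cidr_block  = "10.10.0.0/16"
}

module "rds" {
  source = "../../modules/rds"

  environment    = "staging"
  instance_class = "db.t3.medium" # smaller than production
  # Assumes the vpc module exposes an output named private_subnet_ids.
  subnet_ids = module.vpc.private_subnet_ids
}
```

The production directory would contain the same module calls with different values, which keeps the difference between environments visible in a handful of inputs.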

Separate State Per Environment

Never share Terraform state between staging and production. A corrupted state file, or a staging change accidentally applied against production, is a serious incident. Use separate S3 buckets (or equivalent) with separate DynamoDB lock tables for each environment.

State Management

Remote state is non-negotiable for team environments. Local state files get lost, corrupted, and cause conflicts.

S3 Backend with Locking

```hcl
terraform {
  backend "s3" {
    bucket         = "your-org-terraform-state-prod"
    key            = "services/api/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
```

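The DynamoDB table prevents two engineers from running `terraform apply` simultaneously, a race condition that can corrupt state. Recent Terraform releases (1.10 and later) can also lock natively in S3 through the backend's `use_lockfile` setting; check the S3 backend documentation for the version you run.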

State File Security

Your state file contains sensitive values — database passwords, API keys, private IPs. Ensure:

  • S3 bucket has versioning enabled (so you can roll back)
  • Server-side encryption is on
  • Bucket is not publicly accessible
  • Access is restricted to the CI/CD role and specific engineers
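The first three items on this checklist can themselves be managed in Terraform, for example from the `shared/state-backend/` directory. The following is a minimal sketch assuming AWS provider v4 or later; the resource labels are illustrative, and access restriction (bucket policy and IAM) is left out because it depends on your roles.

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "your-org-terraform-state-prod"
}

# Versioning lets you recover an earlier state file.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Server-side encryption at rest.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Block every form of public access.
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table; the S3 backend expects a string hash key named "LockID".
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```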

CI/CD Integration

Manual `terraform apply` from a developer's laptop is a liability. Every infrastructure change should go through a pull request and an automated pipeline.

The GitOps Workflow

1. Engineer opens a PR with infrastructure changes

2. CI runs `terraform plan` and posts the output as a PR comment

3. A second engineer reviews the plan — not just the code, but the actual diff

4. PR is merged to main

5. CD pipeline runs `terraform apply` automatically

This gives you a full audit trail of every infrastructure change, who approved it, and what the plan showed before it was applied.

Plan Output in PRs

Use a tool like Atlantis or a custom GitHub Actions workflow to post the plan output directly in the PR. Reviewers should be looking at what Terraform will actually do, not just the HCL diff.

Preventing Drift

Infrastructure drift — when the real state of your infrastructure diverges from what Terraform thinks it is — is one of the most common sources of production incidents.

Automated Drift Detection

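Run `terraform plan` on a schedule (daily is usually sufficient) and alert if the plan is non-empty. A non-empty plan against the main branch usually means something changed outside of Terraform, or that a merged change was never applied.

```yaml
# GitHub Actions scheduled drift detection
# Cloud credentials for the backend and provider are omitted here.
on:
  schedule:
    - cron: '0 8 * * *' # daily at 08:00 UTC

jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false # pass Terraform's exit code through unchanged
      - run: terraform init -input=false
      - run: terraform plan -detailed-exitcode
```

With `-detailed-exitcode`, exit code 0 means no changes, 1 means an error, and 2 means changes are pending. Treat exit code 2 as the signal to alert your team.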

Enforce Terraform-Only Changes

Use IAM policies to restrict who can make changes to production infrastructure. The CI/CD role should have the permissions needed to apply Terraform. Human engineers should have read-only access to production by default.
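One way to express the read-only default is to attach AWS's managed `ReadOnlyAccess` policy to the role engineers assume in the production account. This is a sketch under assumptions: the variable `engineer_role_name` is hypothetical, and the role itself is assumed to exist already (for example, created in `shared/iam/`).

```hcl
# Hypothetical input: name of the existing IAM role that engineers assume.
variable "engineer_role_name" {
  description = "IAM role assumed by human engineers in production"
  type        = string
}

# Grant read-only access; write permissions stay with the CI/CD role.
resource "aws_iam_role_policy_attachment" "engineers_read_only" {
  role       = var.engineer_role_name
  policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}
```

Break-glass write access for emergencies can then be a separate, audited role rather than a standing permission.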

Variable Management

Hardcoding values in Terraform is a fast path to security incidents and configuration drift.

Separate Variables by Sensitivity

```hcl
variable "db_password" {
  description = "Database master password"
  type        = string
  sensitive   = true
}
```

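The `sensitive = true` flag redacts the value in plan and apply output. It does not encrypt anything: the value is still written in plaintext to the state file, which is why the state bucket itself must be locked down.

Where each kind of variable should live:

  • Non-sensitive variables: store in `terraform.tfvars` files, committed to the repo
  • Sensitive variables: store in AWS Secrets Manager, HashiCorp Vault, or your CI/CD secret store, never in the repo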
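If the secret lives in AWS Secrets Manager, Terraform can read it at plan time instead of receiving it as an input. The sketch below assumes a secret with the hypothetical ID `prod/api/db-password` already exists; note that a value read this way still ends up in the state file.

```hcl
# Hypothetical secret ID; the secret is created and rotated outside this configuration.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/api/db-password"
}

locals {
  # Reference local.db_password wherever the password is needed,
  # for example in the password argument of a database resource.
  db_password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
```

When the value comes from a CI/CD secret store instead, exporting it as the environment variable `TF_VAR_db_password` populates the `db_password` variable without it touching the repo.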

Tagging Strategy

Every resource should be tagged consistently. Tags are how you track costs, identify owners, and automate operations.

```hcl
locals {
  common_tags = {
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "terraform"
    Owner       = var.team_name
    CostCenter  = var.cost_center
  }
}
```

Apply `local.common_tags` to every resource. This makes cost allocation, security audits, and cleanup operations dramatically easier.
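With a reasonably recent AWS provider, one way to avoid repeating the tags on each resource is the provider-level `default_tags` block. The sketch below relies on the `local.common_tags` definition above; the bucket name and the `Name` tag are illustrative assumptions.

```hcl
provider "aws" {
  region = "us-east-1"

  # Applied to every taggable resource created through this provider.
  default_tags {
    tags = local.common_tags
  }
}

resource "aws_s3_bucket" "assets" {
  bucket = "your-org-assets" # hypothetical bucket name

  # Resource-level tags are merged with the provider defaults.
  tags = {
    Name = "assets"
  }
}
```

For providers without a default-tags mechanism, `tags = merge(local.common_tags, { Name = "assets" })` on each resource achieves the same result explicitly.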

Testing Infrastructure Changes

Terraform changes can have cascading effects that are not obvious from the plan output. A few practices catch problems before they reach production.

Always Test in Staging First

This sounds obvious, but it is frequently skipped under time pressure. Staging should be as close to production as possible — same module versions, same configuration, smaller instance sizes.
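In practice this means the two environments differ only in their variable files. As an illustration, with hypothetical variable names (`db_instance_class`, `api_desired_count`) and values:

```hcl
# environments/staging/terraform.tfvars (hypothetical names and values)
environment       = "staging"
db_instance_class = "db.t3.medium"
api_desired_count = 1
```

```hcl
# environments/production/terraform.tfvars (hypothetical names and values)
environment       = "production"
db_instance_class = "db.r6g.xlarge"
api_desired_count = 4
```

If a setting appears in one file but not the other, that is usually a sign staging has stopped mirroring production.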

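Use `-target` Carefully

Targeted plans and applies (`-target`) are useful for emergencies but dangerous as a habit. They update only the named resources and their dependencies, so the rest of your infrastructure can fall out of sync with the configuration. Prefer full applies whenever possible, and follow any targeted apply with a full `terraform plan` to confirm nothing else is pending.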

Validate Before Apply

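```bash
terraform fmt -check          # fail if any file is not canonically formatted
terraform init -input=false   # validate and plan need an initialized directory
terraform validate            # syntax errors and internal inconsistencies
terraform plan -out=tfplan    # save the plan for review and apply
```

Run these in CI before any apply. `terraform fmt -check` enforces consistent formatting. `terraform validate` catches syntax errors and obvious configuration mistakes without contacting the cloud provider. Saving the plan with `-out=tfplan` lets the apply step run `terraform apply tfplan`, so what gets applied is exactly what was reviewed.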

The Patterns That Matter Most

Across dozens of production systems, the practices that have prevented the most incidents are:

1. Remote state with locking — always

2. Every change through a PR with plan review

3. Separate state per environment

4. Automated drift detection

5. Sensitive values never in the repo

Get these right and Terraform becomes a reliable, auditable foundation for your infrastructure. Skip them and you will eventually have a bad day.

#Terraform #IaC #DevOps #AWS

About E. Lopez

CTO at DreamTech Dynamics