Terraform’s getting-started guide shows a single main.tf with everything in it. That works for demos. It doesn’t work when you have 50 resources, 5 environments, and a team making changes simultaneously.

These patterns emerge from scaling Terraform across teams and environments—where state conflicts happen, where modules get copied instead of shared, and where “just run terraform apply” becomes terrifying.

Project Structure

The flat-file approach breaks down fast. Structure by environment and component:

terraform/
├── modules/                  # Shared modules, reusable across environments
│   ├── vpc/
│   ├── ecs-cluster/
│   └── rds/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/
└── global/

Key principles:

  • Each environment is a separate Terraform root
  • Modules are shared, configuration differs
  • Global resources live separately

State Management

State is Terraform’s memory. Lose it, and Terraform forgets what it created.

Remote state with S3 + DynamoDB locking:

# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "environments/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

State isolation per environment:

s3://mycompany-terraform-state/
├── environments/dev/terraform.tfstate
├── environments/staging/terraform.tfstate
├── environments/prod/terraform.tfstate
└── global/terraform.tfstate

Never share state between environments. A bad apply in dev shouldn’t affect prod’s state file.

Module Design

Good modules are reusable without being over-engineered.

Module structure:

modules/ecs-cluster/
├── main.tf       # Resources
├── variables.tf  # Inputs
├── outputs.tf    # Outputs
├── versions.tf   # Provider requirements
└── README.md     # Usage docs

Variables with sensible defaults:

# modules/ecs-cluster/variables.tf

variable "cluster_name" {
  description = "Name of the ECS cluster"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for cluster nodes"
  type        = string
  default     = "t3.medium"
}

variable "min_size" {
  description = "Minimum number of instances"
  type        = number
  default     = 2
}

variable "max_size" {
  description = "Maximum number of instances"
  type        = number
  default     = 10
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}

Expose useful outputs:

# modules/ecs-cluster/outputs.tf

output "cluster_id" {
  description = "ID of the ECS cluster"
  value       = aws_ecs_cluster.main.id
}

output "cluster_arn" {
  description = "ARN of the ECS cluster"
  value       = aws_ecs_cluster.main.arn
}

output "security_group_id" {
  description = "Security group ID for cluster instances"
  value       = aws_security_group.cluster.id
}

Module usage:

# environments/prod/main.tf

module "ecs_cluster" {
  source = "../../modules/ecs-cluster"

  cluster_name  = "prod-api"
  instance_type = "t3.large"
  min_size      = 3
  max_size      = 20

  tags = {
    Environment = "prod"
    Team        = "platform"
  }
}

Variable Management

Don’t hardcode. Use tfvars files per environment:

# environments/prod/terraform.tfvars

environment    = "prod"
instance_type  = "t3.large"
min_instances  = 3
max_instances  = 20
enable_logging = true

database_config = {
  instance_class = "db.r5.large"
  storage_gb     = 500
  multi_az       = true
}

Sensitive values via environment variables:

# Don't put secrets in tfvars
export TF_VAR_database_password="$DATABASE_PASSWORD"
terraform apply

Or use data sources:

data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/database/password"
}

resource "aws_db_instance" "main" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
  # ...
}

Data Sources for Loose Coupling

Reference existing resources without tight module dependencies:

# Instead of passing VPC ID through every module
data "aws_vpc" "main" {
  tags = {
    Name = "${var.environment}-vpc"
  }
}

data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }

  tags = {
    Tier = "private"
  }
}

resource "aws_ecs_service" "api" {
  network_configuration {
    subnets = data.aws_subnets.private.ids
  }
}

Benefits:

  • Modules don’t need VPC ID passed explicitly
  • Infrastructure can be queried dynamically
  • Less coupling between state files

Lifecycle Rules

Control how Terraform handles resource changes:

resource "aws_instance" "main" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type

  lifecycle {
    # Don't destroy before creating replacement
    create_before_destroy = true

    # Ignore changes made outside Terraform
    ignore_changes = [
      tags["LastModified"],
    ]

    # Never destroy this resource
    prevent_destroy = true
  }
}

Common patterns:

# Database - never accidentally destroy
lifecycle {
  prevent_destroy = true
}

# Auto-scaled instances - ignore count changes
lifecycle {
  ignore_changes = [desired_capacity]
}

# Blue-green deployments - create new before destroying old
lifecycle {
  create_before_destroy = true
}

Workspaces vs Directories

Terraform workspaces provide state isolation within a single configuration:

terraform workspace new staging
terraform workspace select staging
terraform apply

When to use workspaces:

  • Same configuration, different instances (multi-tenant)
  • Quick environment switching
  • Simple projects

When to use directories:

  • Different configurations per environment
  • Different provider versions needed
  • Team collaboration (less confusion)

My recommendation: Use directories for environments. Workspaces are too easy to mess up (“which workspace am I in?”).

Remote State Data Sources

Share outputs between state files without tight coupling:

# environments/prod/networking/outputs.tf
output "vpc_id" {
  value = aws_vpc.main.id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
# environments/prod/application/main.tf
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "mycompany-terraform-state"
    key    = "environments/prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_ecs_service" "api" {
  network_configuration {
    subnets = data.terraform_remote_state.networking.outputs.private_subnet_ids
  }
}

Caveats:

  • Creates implicit dependency on other state
  • Outputs must be stable (changing them breaks consumers)
  • Consider using SSM Parameter Store instead for more flexibility
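The SSM alternative mentioned above decouples producer and consumer entirely: the networking stack publishes a parameter, and any other stack reads it without knowing where the state lives. A minimal sketch — the parameter name is an assumption, not from the original:

```hcl
# Producer (networking stack): publish the VPC ID as an SSM parameter.
# The parameter path "/prod/networking/vpc_id" is illustrative.
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/prod/networking/vpc_id"
  type  = "String"
  value = aws_vpc.main.id
}

# Consumer (application stack): read it with no reference to the
# producer's backend or state file.
data "aws_ssm_parameter" "vpc_id" {
  name = "/prod/networking/vpc_id"
}

# Reference as data.aws_ssm_parameter.vpc_id.value
```

Renaming the parameter still breaks consumers, but unlike remote state, the contract is an explicit, documented path rather than the internal layout of another team's state file.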

Moved Blocks

Refactor without destroying resources:

# Old: resource "aws_instance" "web" { ... }
# New: module "web" { source = "./modules/web-server" }

moved {
  from = aws_instance.web
  to   = module.web.aws_instance.main
}

Run terraform plan to verify the move, then apply. The resource stays intact.

Import Existing Resources

Bring existing infrastructure under Terraform management:

# Import an existing resource into state
terraform import aws_instance.web i-1234567890abcdef0

# Generate configuration from import blocks (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf

Import workflow:

  1. Write resource block with expected arguments
  2. Run terraform import
  3. Run terraform plan to verify no changes
  4. Adjust configuration until plan shows no diff
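Terraform 1.5+ also supports expressing this workflow declaratively with an `import` block, which is what `-generate-config-out` reads from. A sketch under that assumption — the AMI ID is hypothetical:

```hcl
# Declarative import (Terraform 1.5+): plan previews the import,
# apply records the resource in state. Reviewable in a PR like any
# other change.
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}

resource "aws_instance" "web" {
  # Arguments matching the real instance; adjust until plan shows no diff.
  ami           = "ami-0abcdef1234567890" # hypothetical
  instance_type = "t3.medium"
}
```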

CI/CD Integration

Automate plan and apply:

# .github/workflows/terraform.yml
name: Terraform

on:
  pull_request:
    paths: ['terraform/**']
  push:
    branches: [main]
    paths: ['terraform/**']

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init
        working-directory: terraform/environments/prod

      - name: Terraform Plan
        run: terraform plan -out=tfplan
        working-directory: terraform/environments/prod

      - name: Upload Plan
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod/tfplan

  apply:
    needs: plan
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init
        working-directory: terraform/environments/prod

      - name: Download Plan
        uses: actions/download-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod

      - name: Terraform Apply
        run: terraform apply tfplan
        working-directory: terraform/environments/prod

Key practices:

  • Plan on PR, apply on merge to main
  • Use GitHub environments for approval gates
  • Save plan file to ensure apply matches plan

Common Mistakes

1. Using -target carelessly:

# Dangerous - can leave state inconsistent
terraform apply -target=aws_instance.web

Use only for debugging, never in automation.

2. Forgetting state locking: Two people run apply simultaneously → state corruption. Always use DynamoDB locking with S3 backend.
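The lock table referenced in the backend config has one hard requirement: a string partition key named exactly `LockID`. A minimal sketch, typically kept in a small bootstrap configuration applied once before any backend uses it:

```hcl
# Bootstrap: the DynamoDB table the S3 backend locks against.
# The S3 backend requires the partition key to be named "LockID".
resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST" # lock traffic is tiny; no capacity planning

  hash_key = "LockID"
  attribute {
    name = "LockID"
    type = "S"
  }
}
```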

3. Not pinning provider versions:

# Do this
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

4. Giant monolithic state: Split by component. Network, compute, and database don’t need to be in the same state file.
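Splitting by component just means each component gets its own root directory and its own key in the same state bucket — for example (the component keys are illustrative):

```hcl
# environments/prod/networking/backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "environments/prod/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

# environments/prod/compute/backend.tf is identical except for
# key = "environments/prod/compute/terraform.tfstate", and so on.
```

A bad apply in the compute stack can then never corrupt networking state, and plans stay fast because each root only refreshes its own resources.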

Quick Reference

Pattern                 Use Case
--------------------    ----------------------
Separate directories    Environment isolation
Remote state            Team collaboration
Modules                 Code reuse
Data sources            Loose coupling
Lifecycle rules         Resource protection
Moved blocks            Refactoring

Terraform at scale is about state management and module design. Get those right, and everything else follows.


Infrastructure as code is only as good as your code organization. Treat it like software engineering, because it is.