Terraform’s getting-started guide shows a single main.tf with everything in it. That works for demos. It doesn’t work when you have 50 resources, 5 environments, and a team making changes simultaneously.
These patterns emerge from scaling Terraform across teams and environments—where state conflicts happen, where modules get copied instead of shared, and where “just run terraform apply” becomes terrifying.
Project Structure

The flat-file approach breaks down fast. Structure by environment and component:
terraform/
├── modules/                  # Reusable modules
│   ├── vpc/
│   ├── ecs-cluster/
│   └── rds/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── backend.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/
└── global/                   # Shared across environments
    ├── iam/
    ├── dns/
    └── s3/

Key principles:
- Each environment is a separate Terraform root
- Modules are shared, configuration differs
- Global resources live separately

State Management

State is Terraform’s memory. Lose it, and Terraform forgets what it created.
Remote state with S3 + DynamoDB locking:
# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "environments/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
State isolation per environment:
s3://mycompany-terraform-state/
├── environments/
│   ├── dev/terraform.tfstate
│   ├── staging/terraform.tfstate
│   └── prod/terraform.tfstate
└── global/
    ├── iam/terraform.tfstate
    └── dns/terraform.tfstate
Never share state between environments. A bad apply in dev shouldn’t affect prod’s state file.
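Isolation falls out of the backend configuration: each environment’s root points at its own key in the shared bucket. A minimal sketch for dev, reusing the bucket and lock table from the prod example:

```hcl
# environments/dev/backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "environments/dev/terraform.tfstate" # dev-only key
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
```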
Module Design

Good modules are reusable without being over-engineered.
Module structure:
modules/ecs-cluster/
├── main.tf       # Resources
├── variables.tf  # Inputs
├── outputs.tf    # Outputs
├── versions.tf   # Provider requirements
└── README.md     # Usage documentation
Variables with sensible defaults:
# modules/ecs-cluster/variables.tf
variable "cluster_name" {
  description = "Name of the ECS cluster"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for cluster nodes"
  type        = string
  default     = "t3.medium"
}

variable "min_size" {
  description = "Minimum number of instances"
  type        = number
  default     = 2
}

variable "max_size" {
  description = "Maximum number of instances"
  type        = number
  default     = 10
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}
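Variables can also enforce constraints at plan time with `validation` blocks (available since Terraform 0.13). A sketch using the `min_size` variable above:

```hcl
variable "min_size" {
  description = "Minimum number of instances"
  type        = number
  default     = 2

  validation {
    condition     = var.min_size >= 1
    error_message = "min_size must be at least 1."
  }
}
```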
Expose useful outputs:
# modules/ecs-cluster/outputs.tf
output "cluster_id" {
  description = "ID of the ECS cluster"
  value       = aws_ecs_cluster.main.id
}

output "cluster_arn" {
  description = "ARN of the ECS cluster"
  value       = aws_ecs_cluster.main.arn
}

output "security_group_id" {
  description = "Security group ID for cluster instances"
  value       = aws_security_group.cluster.id
}
Module usage:
# environments/prod/main.tf
module "ecs_cluster" {
  source = "../../modules/ecs-cluster"

  cluster_name  = "prod-api"
  instance_type = "t3.large"
  min_size      = 3
  max_size      = 20

  tags = {
    Environment = "prod"
    Team        = "platform"
  }
}
Variable Management

Don’t hardcode. Use tfvars files per environment:
# environments/prod/terraform.tfvars
environment    = "prod"
instance_type  = "t3.large"
min_instances  = 3
max_instances  = 20
enable_logging = true

database_config = {
  instance_class = "db.r5.large"
  storage_gb     = 500
  multi_az       = true
}
Sensitive values via environment variables:
# Don't put secrets in tfvars
export TF_VAR_database_password="$DATABASE_PASSWORD"
terraform apply
Or use data sources:
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/database/password"
}

resource "aws_db_instance" "main" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
  # ...
}
Data Sources for Loose Coupling

Reference existing resources without tight module dependencies:
# Instead of passing VPC ID through every module
data "aws_vpc" "main" {
  tags = {
    Name = "${var.environment}-vpc"
  }
}

data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }

  tags = {
    Tier = "private"
  }
}

resource "aws_ecs_service" "api" {
  # ...
  network_configuration {
    subnets = data.aws_subnets.private.ids
  }
}
Benefits:
- Modules don’t need VPC ID passed explicitly
- Infrastructure can be queried dynamically
- Less coupling between state files

Lifecycle Rules

Control how Terraform handles resource changes:
resource "aws_instance" "main" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type

  lifecycle {
    # Don't destroy before creating replacement
    create_before_destroy = true

    # Ignore changes made outside Terraform
    ignore_changes = [
      tags["LastModified"],
    ]

    # Never destroy this resource
    prevent_destroy = true
  }
}
Common patterns:
# Database - never accidentally destroy
lifecycle {
  prevent_destroy = true
}

# Auto-scaled instances - ignore count changes
lifecycle {
  ignore_changes = [desired_capacity]
}

# Blue-green deployments - create new before destroying old
lifecycle {
  create_before_destroy = true
}
Workspaces vs Directories

Terraform workspaces provide state isolation within a single configuration:
terraform workspace new staging
terraform workspace select staging
terraform apply
When to use workspaces:
- Same configuration, different instances (multi-tenant)
- Quick environment switching
- Simple projects

When to use directories:

- Different configurations per environment
- Different provider versions needed
- Team collaboration (less confusion)

My recommendation: Use directories for environments. Workspaces are too easy to mess up (“which workspace am I in?”).
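If you do reach for workspaces, the current workspace name is available in configuration as `terraform.workspace`. A sketch with illustrative sizing values:

```hcl
# Size instances by workspace (values are illustrative)
locals {
  instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.medium"
}

resource "aws_instance" "main" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.instance_type
}
```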
Remote State Data Sources

Share outputs between state files without tight coupling:
# environments/prod/networking/outputs.tf
output "vpc_id" {
  value = aws_vpc.main.id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
# environments/prod/application/main.tf
data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state"
    key    = "environments/prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_ecs_service" "api" {
  # ...
  network_configuration {
    subnets = data.terraform_remote_state.networking.outputs.private_subnet_ids
  }
}
Caveats:
- Creates implicit dependency on other state
- Outputs must be stable (changing them breaks consumers)
- Consider using SSM Parameter Store instead for more flexibility

Moved Blocks

Refactor without destroying resources:
# Old: resource "aws_instance" "web" { ... }
# New: module "web" { source = "./modules/web-server" }

moved {
  from = aws_instance.web
  to   = module.web.aws_instance.main
}
Run terraform plan to verify the move, then apply. The resource stays intact.
Import Existing Resources

Bring existing infrastructure under Terraform management:
# Import existing resource
terraform import aws_instance.web i-1234567890abcdef0

# Generate configuration from import blocks (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf
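Terraform 1.5+ also supports declarative `import` blocks, which is what `-generate-config-out` works from. A sketch reusing the same instance ID:

```hcl
# Declarative import (Terraform 1.5+)
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}
```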
Import workflow:
1. Write resource block with expected arguments
2. Run terraform import
3. Run terraform plan to verify no changes
4. Adjust configuration until plan shows no diff

CI/CD Integration

Automate plan and apply:
# .github/workflows/terraform.yml
name: Terraform

on:
  pull_request:
    paths: ['terraform/**']
  push:
    branches: [main]
    paths: ['terraform/**']

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform Init
        run: terraform init
        working-directory: terraform/environments/prod
      - name: Terraform Plan
        run: terraform plan -out=tfplan
        working-directory: terraform/environments/prod
      - name: Upload Plan
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod/tfplan

  apply:
    needs: plan
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform Init
        run: terraform init
        working-directory: terraform/environments/prod
      - name: Download Plan
        uses: actions/download-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod
      - name: Terraform Apply
        run: terraform apply -auto-approve tfplan
        working-directory: terraform/environments/prod
Key practices:
- Plan on PR, apply on merge to main
- Use GitHub environments for approval gates
- Save the plan file to ensure apply matches plan

Common Mistakes

1. Using -target carelessly:
# Dangerous - can leave state inconsistent
terraform apply -target=aws_instance.web
Use only for debugging, never in automation.
2. Forgetting state locking:
Two people run apply simultaneously → state corruption.
Always use DynamoDB locking with S3 backend.
3. Not constraining provider versions:
# Do this
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
4. Giant monolithic state:
Split by component. Network, compute, and database don’t need to be in the same state file.
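A split along those lines gives each component its own root and state key (layout assumed; it mirrors the networking/application paths used earlier):

```
environments/prod/
├── networking/    # VPC, subnets — own state file
├── application/   # ECS services — own state file
└── database/      # RDS — own state file
```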
Quick Reference

| Pattern | Use Case |
|---|---|
| Separate directories | Environment isolation |
| Remote state | Team collaboration |
| Modules | Code reuse |
| Data sources | Loose coupling |
| Lifecycle rules | Resource protection |
| Moved blocks | Refactoring |
Terraform at scale is about state management and module design. Get those right, and everything else follows.
Infrastructure as code is only as good as your code organization. Treat it like software engineering, because it is.