Terraform state is the source of truth for your infrastructure. Mess it up, and you’re in for a bad day. Here’s how to manage it properly.
Why State Matters
Terraform tracks the mapping between your configuration and real infrastructure in a state file. Without it, Terraform doesn’t know:
- What resources it created
- What the current configuration looks like
- What needs to change on the next apply
The default terraform.tfstate file is local, which breaks down as soon as you have a team: nobody else can see your state, and nothing stops two people from applying at once.
Remote Backend: S3 + DynamoDB
The gold standard for AWS environments: S3 stores the state file and a DynamoDB table holds the lock.
# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
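Create the backend infrastructure first. This is a chicken-and-egg problem: the bucket that stores state has to exist before Terraform can store state in it, so bootstrap it once with local state: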
# bootstrap/main.tf - Run this once with local state
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycompany-terraform-state"

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_dynamodb_table" "terraform_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
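A state bucket should never be publicly reachable. As a sketch of one more bootstrap resource that could sit alongside the ones above, this blocks all public access. The resource label terraform_state simply mirrors the bucket defined above; whether your account already enforces this at the account level is something to check.

```hcl
# bootstrap/main.tf (continued) - assumes aws_s3_bucket.terraform_state from above
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```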
State Locking
DynamoDB provides locking to prevent concurrent modifications:
User A: terraform apply
  → Acquires lock on "prod/networking/terraform.tfstate"
  → Makes changes
  → Releases lock
User B: terraform apply (while A is running)
  → Attempts to acquire lock
  → FAILS: "Error acquiring the state lock"
  → Waits or uses -lock=false (dangerous)
Never use -lock=false in production. If you’re stuck with a stale lock (for example, after a crashed run), first confirm that nobody is still applying, then release it:
# Find the lock ID from the error message
terraform force-unlock LOCK_ID
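DynamoDB is not the only way to lock. Recent Terraform releases (1.10 and later, as far as I’m aware) let the S3 backend keep a lock file next to the state object via use_lockfile, and newer releases mark dynamodb_table as deprecated. A minimal sketch, assuming your Terraform version supports the argument:

```hcl
terraform {
  backend "s3" {
    bucket       = "mycompany-terraform-state"
    key          = "prod/networking/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # S3-native locking; check that your Terraform version supports it
  }
}
```

Changing a backend block requires re-running terraform init before the new locking mode takes effect.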
Workspace Strategy
Workspaces let you manage multiple environments with the same configuration:
# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod
# Switch workspaces
terraform workspace select prod
# List workspaces
terraform workspace list
Reference the workspace in your config:
locals {
  environment = terraform.workspace

  instance_type = {
    dev     = "t3.micro"
    staging = "t3.small"
    prod    = "t3.large"
  }
}

resource "aws_instance" "app" {
  instance_type = local.instance_type[local.environment]

  tags = {
    Environment = local.environment
  }
}
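One catch with the map above: a workspace that has no entry (including the built-in default workspace) makes the index expression fail at plan time. A sketch of a more forgiving variant using lookup() with a fallback; the names instance_types and selected_instance_type and the t3.micro fallback are illustrative choices, not part of the original config:

```hcl
locals {
  environment = terraform.workspace

  instance_types = {
    dev     = "t3.micro"
    staging = "t3.small"
    prod    = "t3.large"
  }

  # Fall back to the smallest size for workspaces without an entry (e.g. "default")
  instance_type = lookup(local.instance_types, local.environment, "t3.micro")
}

# Expose the result so it is easy to check per workspace
output "selected_instance_type" {
  value = local.instance_type
}
```

If a silent fallback is riskier than a failed plan in your setup, keeping the strict index expression is the safer choice.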
With the S3 backend, each non-default workspace’s state is stored under a workspace prefix (env: by default):
s3://mycompany-terraform-state/
  env:/dev/networking/terraform.tfstate
  env:/staging/networking/terraform.tfstate
  env:/prod/networking/terraform.tfstate
State File Structure
For large infrastructures, split state by component:
infrastructure/
├── networking/
│   ├── main.tf
│   └── backend.tf  # key = "prod/networking/terraform.tfstate"
├── database/
│   ├── main.tf
│   └── backend.tf  # key = "prod/database/terraform.tfstate"
├── compute/
│   ├── main.tf
│   └── backend.tf  # key = "prod/compute/terraform.tfstate"
└── monitoring/
    ├── main.tf
    └── backend.tf  # key = "prod/monitoring/terraform.tfstate"
Use terraform_remote_state to reference outputs across states:
# In compute/main.tf
data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state"
    key    = "prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_id
}
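terraform_remote_state can only read values that the other configuration exports as root-level outputs. A sketch of what the networking side would need; the file name, the vpc_id variable, and the aws_subnet.private resource are assumptions for illustration:

```hcl
# networking/outputs.tf (hypothetical file name)
variable "vpc_id" {
  type        = string
  description = "ID of an existing VPC (placeholder input for this sketch)"
}

resource "aws_subnet" "private" {
  vpc_id     = var.vpc_id
  cidr_block = "10.0.1.0/24" # placeholder CIDR
}

# Only root-module outputs are visible through terraform_remote_state
output "private_subnet_id" {
  value = aws_subnet.private.id
}
```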
State Recovery
Recovering from S3 Versioning
# List versions
aws s3api list-object-versions \
  --bucket mycompany-terraform-state \
  --prefix prod/networking/terraform.tfstate

# Download a previous version
aws s3api get-object \
  --bucket mycompany-terraform-state \
  --key prod/networking/terraform.tfstate \
  --version-id "abc123" \
  terraform.tfstate.backup

# Review it, then restore
aws s3 cp terraform.tfstate.backup \
  s3://mycompany-terraform-state/prod/networking/terraform.tfstate
Importing Existing Resources
When resources exist but aren’t in state:
# Import an existing EC2 instance
terraform import aws_instance.app i-1234567890abcdef0
# Import an S3 bucket
terraform import aws_s3_bucket.logs my-logging-bucket
# Import with module path
terraform import module.vpc.aws_vpc.main vpc-abc123
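Terraform 1.5 and later also support declaring imports in configuration, so the import goes through plan and review like any other change. A sketch equivalent to the first command above; the ami and instance_type values are placeholders you would replace with the real instance’s settings:

```hcl
# Declarative counterpart of: terraform import aws_instance.app i-1234567890abcdef0
import {
  to = aws_instance.app
  id = "i-1234567890abcdef0"
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0" # placeholder; must match the real instance
  instance_type = "t3.micro"              # placeholder
}
```

Running terraform plan shows the pending import. If you have not written the resource block yet, terraform plan -generate-config-out=generated.tf can draft one for you to review.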
Removing Resources from State
When you want to stop managing a resource without destroying it:
# Remove from state (resource continues to exist)
terraform state rm aws_instance.legacy_server
# Move resource to different state address
terraform state mv aws_instance.old aws_instance.new
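Both operations have declarative counterparts in newer Terraform versions: moved blocks (1.1 and later) and removed blocks (1.7 and later). A sketch mirroring the two commands above, assuming aws_instance.new is already defined in the configuration and the aws_instance.legacy_server block has been deleted from it:

```hcl
# Rename without destroy/recreate: state follows the new address on the next apply
moved {
  from = aws_instance.old
  to   = aws_instance.new
}

# Stop managing the resource but leave the real instance running
removed {
  from = aws_instance.legacy_server

  lifecycle {
    destroy = false
  }
}
```

Because these live in version control, teammates and CI pick up the same state change instead of someone running a one-off command by hand.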
State Inspection
# List all resources in state
terraform state list
# Show details of a specific resource
terraform state show aws_instance.app
# Pull remote state to local file (for inspection)
terraform state pull > state.json
# Push local state to remote (dangerous - use carefully)
terraform state push state.json
Sensitive Data in State
State files contain sensitive data (passwords, keys, etc.) in plaintext. Protect them:
- Encrypt at rest: Use S3 SSE or KMS
- Encrypt in transit: Terraform talks to S3 over HTTPS by default
- Restrict access: IAM policies on the S3 bucket
- Never commit state: Add *.tfstate* to .gitignore
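# IAM policy for state bucket (the S3 backend also needs s3:ListBucket on the bucket itself)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::mycompany-terraform-state",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Team": "platform"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::mycompany-terraform-state/*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Team": "platform"
        }
      }
    }
  ]
}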
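IAM policies control who can read state; a bucket policy can additionally refuse any request that does not use TLS. A sketch, assuming the bucket is the aws_s3_bucket.terraform_state resource from the bootstrap configuration and that no other bucket policy is attached (a bucket holds only one). The resource label terraform_state_tls_only is made up for this example:

```hcl
resource "aws_s3_bucket_policy" "terraform_state_tls_only" {
  bucket = aws_s3_bucket.terraform_state.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "DenyInsecureTransport"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        # Cover both the bucket and every object in it
        Resource = [
          aws_s3_bucket.terraform_state.arn,
          "${aws_s3_bucket.terraform_state.arn}/*",
        ]
        # Matches requests made over plain HTTP
        Condition = {
          Bool = { "aws:SecureTransport" = "false" }
        }
      }
    ]
  })
}
```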
CI/CD Integration
In pipelines, handle state carefully:
# .github/workflows/terraform.yml
jobs:
  terraform:
    runs-on: ubuntu-latest
    # Assuming a role through GitHub OIDC requires an ID token
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/terraform-ci
          aws-region: us-east-1

      # Make sure a terraform binary is on PATH
      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init -backend-config="key=prod/${{ github.event.inputs.component }}/terraform.tfstate"

      - name: Terraform Plan
        run: terraform plan -out=tfplan

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply -auto-approve tfplan
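The -backend-config flag in the init step pairs naturally with a backend block that leaves key out (a partial backend configuration). A sketch of what that block could look like, reusing the bucket and table names from earlier; the component placeholder stands for whatever the pipeline input provides:

```hcl
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
    # key is intentionally omitted; the pipeline supplies it with
    # terraform init -backend-config="key=prod/<component>/terraform.tfstate"
  }
}
```

With this layout one workflow can serve every component, and the state path stays visible in the pipeline definition.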
Key Takeaways
- Always use remote state for team environments
- Enable versioning on your state bucket: it will save you
- Use locking to prevent concurrent modifications
- Split state by component for large infrastructures
- Treat state as sensitive: encrypt and restrict access
- Practice recovery before you need it
State management isn’t glamorous, but getting it right prevents disasters.