Terraform state is the source of truth for your infrastructure. Mess it up, and you’re in for a bad day. Here’s how to manage it properly.

Why State Matters

Terraform tracks the mapping between your configuration and real infrastructure in a state file. Without it, Terraform doesn’t know:

  • What resources it created
  • What the current configuration looks like
  • What needs to change on the next apply

The default terraform.tfstate file is local, which breaks immediately when you have a team.

Remote Backend: S3 + DynamoDB

The gold standard for AWS environments:

# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

Create the backend infrastructure first (chicken-and-egg problem):

# bootstrap/main.tf - Run this once with local state
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycompany-terraform-state"

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_dynamodb_table" "terraform_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
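Since the state bucket will hold secrets, it's worth denying public access in the same bootstrap run. A minimal sketch using the provider's standard public-access-block resource (an addition to the bootstrap config above, not part of the original):

```hcl
# bootstrap/main.tf (continued) — deny all public access to the state bucket
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```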

State Locking

DynamoDB provides locking to prevent concurrent modifications:

When User A runs terraform apply, Terraform first writes a lock item to the DynamoDB table for prod/networking/terraform.tfstate. If User B starts an apply against the same state while the lock is held, their run waits (or errors with the lock holder's details) until User A's lock is released.

Never use -lock=false in production. If you’re stuck with a stale lock:

# Find the lock ID from the error message
terraform force-unlock LOCK_ID

Workspace Strategy

Workspaces let you manage multiple environments with the same configuration:

# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod

# Switch workspaces
terraform workspace select prod

# List workspaces
terraform workspace list

Reference the workspace in your config:

locals {
  environment = terraform.workspace
  
  instance_type = {
    dev     = "t3.micro"
    staging = "t3.small"
    prod    = "t3.large"
  }
}

resource "aws_instance" "app" {
  ami           = var.app_ami  # assumed variable; aws_instance requires an AMI
  instance_type = local.instance_type[local.environment]

  tags = {
    Environment = local.environment
  }
}

Each workspace's state is stored in the bucket under an env: prefix:

s3://mycompany-terraform-state/
  env:/dev/networking/terraform.tfstate
  env:/staging/networking/terraform.tfstate
  env:/prod/networking/terraform.tfstate

State File Structure

For large infrastructures, split state by component:

infra/
  networking/
    main.tf
    backend.tf   # key = "prod/networking/terraform.tfstate"
  database/
    main.tf
    backend.tf   # key = "prod/database/terraform.tfstate"
  compute/
    main.tf
    backend.tf   # key = "prod/compute/terraform.tfstate"
  monitoring/
    main.tf
    backend.tf   # key = "prod/monitoring/terraform.tfstate"

Use terraform_remote_state to reference outputs across states:

# In compute/main.tf
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "mycompany-terraform-state"
    key    = "prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami       = var.app_ami  # assumed variable; aws_instance requires an AMI
  subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_id
}
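For this to work, the networking configuration must export the value as an output; only declared outputs are readable through terraform_remote_state. A minimal sketch, assuming the networking config defines a subnet resource named private:

```hcl
# In networking/outputs.tf
output "private_subnet_id" {
  value = aws_subnet.private.id  # assumes networking/ defines aws_subnet.private
}
```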

State Recovery

Recovering from S3 Versioning

# List versions
aws s3api list-object-versions \
  --bucket mycompany-terraform-state \
  --prefix prod/networking/terraform.tfstate

# Download a previous version
aws s3api get-object \
  --bucket mycompany-terraform-state \
  --key prod/networking/terraform.tfstate \
  --version-id "abc123" \
  terraform.tfstate.backup

# Review it, then restore
aws s3 cp terraform.tfstate.backup \
  s3://mycompany-terraform-state/prod/networking/terraform.tfstate

Importing Existing Resources

When resources exist but aren’t in state:

# Import an existing EC2 instance
terraform import aws_instance.app i-1234567890abcdef0

# Import an S3 bucket
terraform import aws_s3_bucket.logs my-logging-bucket

# Import with module path
terraform import module.vpc.aws_vpc.main vpc-abc123
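On Terraform 1.5+, imports can also be declared in configuration, which makes them reviewable in version control and repeatable across runs. A sketch reusing the instance ID from the CLI example above:

```hcl
# import.tf — declarative alternative to `terraform import`
import {
  to = aws_instance.app
  id = "i-1234567890abcdef0"
}
```

With this in place, `terraform plan -generate-config-out=generated.tf` can even draft the matching resource block for you.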

Removing Resources from State

When you want to stop managing a resource without destroying it:

# Remove from state (resource continues to exist)
terraform state rm aws_instance.legacy_server

# Move resource to different state address
terraform state mv aws_instance.old aws_instance.new
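On Terraform 1.1+, renames can be expressed declaratively with a moved block instead of mutating state by hand, so the refactor is recorded in version control and applied consistently everywhere:

```hcl
# Equivalent to: terraform state mv aws_instance.old aws_instance.new
moved {
  from = aws_instance.old
  to   = aws_instance.new
}
```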

State Inspection

# List all resources in state
terraform state list

# Show details of a specific resource
terraform state show aws_instance.app

# Pull remote state to local file (for inspection)
terraform state pull > state.json

# Push local state to remote (dangerous - use carefully)
terraform state push state.json

Sensitive Data in State

State files contain sensitive data (passwords, keys, etc.) in plaintext. Protect them:

  1. Encrypt at rest: Use S3 SSE or KMS
  2. Encrypt in transit: S3 uses HTTPS by default
  3. Restrict access: IAM policies on the S3 bucket
  4. Never commit state: Add *.tfstate* to .gitignore
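On the Terraform side, marking outputs as sensitive at least keeps secrets out of CLI output, though they are still written to state in plaintext. A sketch; the aws_db_instance.main resource is assumed, not part of the configs above:

```hcl
output "db_password" {
  value     = aws_db_instance.main.password  # hypothetical resource
  sensitive = true  # redacted in plan/apply output; still plaintext in the state file
}
```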
# IAM policy for state bucket
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::mycompany-terraform-state/*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Team": "platform"
        }
      }
    }
  ]
}

CI/CD Integration

In pipelines, handle state carefully:

# .github/workflows/terraform.yml
jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/terraform-ci
          aws-region: us-east-1
      
      - name: Terraform Init
        run: terraform init -backend-config="key=prod/${{ github.event.inputs.component }}/terraform.tfstate"
      
      - name: Terraform Plan
        run: terraform plan -out=tfplan
        
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply tfplan  # a saved plan applies without a confirmation prompt; -auto-approve is not needed
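Since the pipeline injects the state key with -backend-config, the shared backend settings can live in a partial-configuration file instead of being repeated per component. A sketch; the file name backend.hcl is an assumption:

```hcl
# backend.hcl — passed with: terraform init -backend-config=backend.hcl
bucket         = "mycompany-terraform-state"
region         = "us-east-1"
encrypt        = true
dynamodb_table = "terraform-state-lock"
```

The in-code block then shrinks to an empty `backend "s3" {}`, and CI supplies both this file and the per-component key at init time.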

Key Takeaways

  1. Always use remote state for team environments
  2. Enable versioning on your state bucket — it will save you
  3. Use locking to prevent concurrent modifications
  4. Split state by component for large infrastructures
  5. Treat state as sensitive — encrypt and restrict access
  6. Practice recovery before you need it

State management isn’t glamorous, but getting it right prevents disasters.


More infrastructure patterns: Custom GitHub Actions for CI/CD automation.