Terraform State Management: Don't Learn This the Hard Way

Terraform state is both the source of Terraform’s power and the cause of most Terraform disasters. Get it wrong and you’re recreating production resources at 2 AM. Get it right and infrastructure changes become boring (the good kind).

What State Actually Is

Terraform state is a JSON file that maps your configuration to real resources. When you write aws_instance.web, Terraform needs to know which actual EC2 instance that refers to. State is that mapping. ...

March 13, 2026 · 5 min · 1060 words · Rob Washington

Terraform Patterns That Scale

Terraform’s getting-started guide shows a single main.tf with everything in it. That works for demos. It doesn’t work when you have 50 resources, 5 environments, and a team making changes simultaneously. These patterns emerge from scaling Terraform across teams and environments: where state conflicts happen, where modules get copied instead of shared, and where “just run terraform apply” becomes terrifying.

Project Structure

The flat-file approach breaks down fast. Structure by environment and component: ...

March 11, 2026 · 10 min · 1951 words · Rob Washington

Terraform Basics: Infrastructure as Code

Clicking through cloud consoles doesn’t scale. Terraform lets you define infrastructure in code, track changes in git, and deploy the same environment repeatedly.

Core Concepts

Provider: Plugin for a platform (AWS, GCP, Azure, etc.)
Resource: A thing to create (server, database, DNS record)
State: Terraform’s record of what exists
Plan: Preview of changes before applying
Apply: Make the changes happen

Basic Workflow

    terraform init      # Download providers
    terraform plan      # Preview changes
    terraform apply     # Create/update resources
    terraform destroy   # Tear everything down

First Configuration

    # main.tf

    # Configure the AWS provider
    terraform {
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 5.0"
        }
      }
    }

    provider "aws" {
      region = "us-east-1"
    }

    # Create an EC2 instance
    resource "aws_instance" "web" {
      ami           = "ami-0c55b159cbfafe1f0"
      instance_type = "t3.micro"

      tags = {
        Name = "WebServer"
      }
    }

    terraform init    # Downloads AWS provider
    terraform plan    # Shows: 1 to add
    terraform apply   # Creates the instance

Variables

    # variables.tf
    variable "environment" {
      description = "Deployment environment"
      type        = string
      default     = "dev"
    }

    variable "instance_type" {
      description = "EC2 instance type"
      type        = string
      default     = "t3.micro"
    }

    variable "allowed_ips" {
      description = "IPs allowed to SSH"
      type        = list(string)
      default     = ["0.0.0.0/0"]
    }

    # main.tf - use variables
    resource "aws_instance" "web" {
      ami           = "ami-0c55b159cbfafe1f0"
      instance_type = var.instance_type

      tags = {
        Name        = "web-${var.environment}"
        Environment = var.environment
      }
    }

Setting Variables

    # Command line
    terraform apply -var="environment=prod"

    # File (terraform.tfvars)
    environment   = "prod"
    instance_type = "t3.small"

    # Environment variables
    export TF_VAR_environment="prod"

Outputs

    # outputs.tf
    output "instance_ip" {
      description = "Public IP of the instance"
      value       = aws_instance.web.public_ip
    }

    output "instance_id" {
      description = "Instance ID"
      value       = aws_instance.web.id
    }

After apply: ...

March 5, 2026 · 7 min · 1424 words · Rob Washington

Infrastructure as Code with Terraform: A Practical Guide

Clicking through cloud consoles doesn’t scale. Infrastructure as Code (IaC) lets you version, review, and automate your infrastructure just like application code. Terraform has become the de facto standard. Here’s how to use it effectively.

The Basics

Terraform uses HCL (HashiCorp Configuration Language) to declare resources:

    # main.tf
    terraform {
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 5.0"
        }
      }
    }

    provider "aws" {
      region = "us-east-1"
    }

    resource "aws_instance" "web" {
      ami           = "ami-0c55b159cbfafe1f0"
      instance_type = "t3.micro"

      tags = {
        Name = "web-server"
      }
    }

    terraform init      # Download providers
    terraform plan      # Preview changes
    terraform apply     # Create resources
    terraform destroy   # Tear down everything

State Management

Terraform tracks what it created in a state file. Never lose this file. ...

March 4, 2026 · 8 min · 1505 words · Rob Washington

Terraform State Management: Keep Your Infrastructure Sane

Terraform state is the source of truth for your infrastructure. Mess it up and you’ll be manually reconciling resources at 2 AM. Here’s how to manage state properly from day one.

What Is State?

Terraform state maps your configuration to real resources:

    # main.tf
    resource "aws_instance" "web" {
      ami           = "ami-12345"
      instance_type = "t3.micro"
    }

    // terraform.tfstate (simplified)
    {
      "resources": [{
        "type": "aws_instance",
        "name": "web",
        "instances": [{
          "attributes": {
            "id": "i-0abc123def456",
            "ami": "ami-12345",
            "instance_type": "t3.micro"
          }
        }]
      }]
    }

Without state, Terraform doesn’t know aws_instance.web corresponds to i-0abc123def456. It would try to create a new instance every time. ...
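To make the mapping idea concrete, here is a minimal sketch of re-attaching an existing instance to a resource address when state has no record of it. It assumes Terraform 1.5 or later (where the `import` block is available) and reuses the placeholder instance ID and AMI from the excerpt above; neither is a real resource, and this is only one possible way to recover the mapping.

```hcl
# Sketch only: tell Terraform that the existing instance belongs to
# the address aws_instance.web, so the next plan adopts it into state
# instead of proposing a brand-new instance.
import {
  to = aws_instance.web
  id = "i-0abc123def456" # placeholder ID taken from the excerpt
}

resource "aws_instance" "web" {
  ami           = "ami-12345" # placeholder AMI taken from the excerpt
  instance_type = "t3.micro"
}
```

With this in place, `terraform plan` should report the instance as something to import rather than something to add, assuming the configuration matches the real resource.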

March 1, 2026 · 7 min · 1336 words · Rob Washington

Terraform Module Patterns for Reusable Infrastructure

Terraform modules turn infrastructure code from scripts into libraries. Here’s how to design them well.

Module Structure

    modules/
    └── vpc/
        ├── main.tf        # Resources
        ├── variables.tf   # Input variables
        ├── outputs.tf     # Output values
        ├── versions.tf    # Provider requirements
        └── README.md      # Documentation

Basic Module

variables.tf

    variable "name" {
      description = "Name prefix for resources"
      type        = string
    }

    variable "cidr_block" {
      description = "CIDR block for VPC"
      type        = string
      default     = "10.0.0.0/16"
    }

    variable "azs" {
      description = "Availability zones"
      type        = list(string)
    }

    variable "private_subnets" {
      description = "Private subnet CIDR blocks"
      type        = list(string)
      default     = []
    }

    variable "public_subnets" {
      description = "Public subnet CIDR blocks"
      type        = list(string)
      default     = []
    }

    variable "tags" {
      description = "Tags to apply to resources"
      type        = map(string)
      default     = {}
    }

main.tf

    resource "aws_vpc" "this" {
      cidr_block           = var.cidr_block
      enable_dns_hostnames = true
      enable_dns_support   = true

      tags = merge(var.tags, {
        Name = var.name
      })
    }

    resource "aws_subnet" "private" {
      count = length(var.private_subnets)

      vpc_id            = aws_vpc.this.id
      cidr_block        = var.private_subnets[count.index]
      availability_zone = var.azs[count.index % length(var.azs)]

      tags = merge(var.tags, {
        Name = "${var.name}-private-${count.index + 1}"
        Type = "private"
      })
    }

    resource "aws_subnet" "public" {
      count = length(var.public_subnets)

      vpc_id                  = aws_vpc.this.id
      cidr_block              = var.public_subnets[count.index]
      availability_zone       = var.azs[count.index % length(var.azs)]
      map_public_ip_on_launch = true

      tags = merge(var.tags, {
        Name = "${var.name}-public-${count.index + 1}"
        Type = "public"
      })
    }

outputs.tf

    output "vpc_id" {
      description = "VPC ID"
      value       = aws_vpc.this.id
    }

    output "private_subnet_ids" {
      description = "Private subnet IDs"
      value       = aws_subnet.private[*].id
    }

    output "public_subnet_ids" {
      description = "Public subnet IDs"
      value       = aws_subnet.public[*].id
    }

    output "vpc_cidr_block" {
      description = "VPC CIDR block"
      value       = aws_vpc.this.cidr_block
    }

versions.tf

    terraform {
      required_version = ">= 1.0"

      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = ">= 4.0"
        }
      }
    }

Using Modules

Local Module

    module "vpc" {
      source = "./modules/vpc"

      name            = "production"
      cidr_block      = "10.0.0.0/16"
      azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
      private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
      public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

      tags = {
        Environment = "production"
        Project     = "myapp"
      }
    }

    # Use outputs
    resource "aws_instance" "app" {
      subnet_id = module.vpc.private_subnet_ids[0]
      # ...
    }

Registry Module

    module "vpc" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "5.0.0"

      name = "my-vpc"
      cidr = "10.0.0.0/16"

      azs             = ["us-east-1a", "us-east-1b"]
      private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
      public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

      enable_nat_gateway = true
      single_nat_gateway = true
    }

Git Module

    module "vpc" {
      source = "git::https://github.com/org/terraform-modules.git//vpc?ref=v1.2.0"
      # ...
    }

Variable Validation

    variable "environment" {
      description = "Environment name"
      type        = string

      validation {
        condition     = contains(["dev", "staging", "production"], var.environment)
        error_message = "Environment must be dev, staging, or production."
      }
    }

    variable "instance_type" {
      description = "EC2 instance type"
      type        = string
      default     = "t3.micro"

      validation {
        condition     = can(regex("^t3\\.", var.instance_type))
        error_message = "Instance type must be t3 family."
      }
    }

    variable "cidr_block" {
      description = "CIDR block"
      type        = string

      validation {
        condition     = can(cidrhost(var.cidr_block, 0))
        error_message = "Must be a valid CIDR block."
      }
    }

Complex Types

    variable "services" {
      description = "Service configurations"
      type = list(object({
        name        = string
        port        = number
        health_path = optional(string, "/health")
        replicas    = optional(number, 1)
        environment = optional(map(string), {})
      }))
    }

    # Usage
    services = [
      {
        name     = "api"
        port     = 8080
        replicas = 3
        environment = {
          LOG_LEVEL = "info"
        }
      },
      {
        name = "worker"
        port = 9090
      }
    ]

Conditional Resources

    variable "create_nat_gateway" {
      description = "Create NAT gateway"
      type        = bool
      default     = true
    }

    resource "aws_nat_gateway" "this" {
      count = var.create_nat_gateway ? 1 : 0

      allocation_id = aws_eip.nat[0].id
      subnet_id     = aws_subnet.public[0].id
    }

    resource "aws_eip" "nat" {
      count = var.create_nat_gateway ? 1 : 0

      domain = "vpc"
    }

Dynamic Blocks

    variable "ingress_rules" {
      description = "Ingress rules"
      type = list(object({
        port        = number
        protocol    = string
        cidr_blocks = list(string)
      }))
      default = []
    }

    resource "aws_security_group" "this" {
      name        = var.name
      description = var.description
      vpc_id      = var.vpc_id

      dynamic "ingress" {
        for_each = var.ingress_rules
        content {
          from_port   = ingress.value.port
          to_port     = ingress.value.port
          protocol    = ingress.value.protocol
          cidr_blocks = ingress.value.cidr_blocks
        }
      }

      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }
    }

Module Composition

Root Module

    # main.tf
    module "vpc" {
      source = "./modules/vpc"
      # ...
    }

    module "security_groups" {
      source = "./modules/security-groups"

      vpc_id = module.vpc.vpc_id
      # ...
    }

    module "ecs_cluster" {
      source = "./modules/ecs-cluster"

      vpc_id             = module.vpc.vpc_id
      private_subnet_ids = module.vpc.private_subnet_ids
      security_group_ids = [module.security_groups.app_sg_id]
      # ...
    }

Shared Data

    # data.tf - Shared data sources
    data "aws_caller_identity" "current" {}
    data "aws_region" "current" {}

    locals {
      account_id = data.aws_caller_identity.current.account_id
      region     = data.aws_region.current.name

      common_tags = {
        Environment = var.environment
        Project     = var.project
        ManagedBy   = "terraform"
      }
    }

For_each vs Count

Count (Index-based)

    # Problematic: Adding/removing items shifts indices
    resource "aws_subnet" "private" {
      count      = length(var.private_subnets)
      cidr_block = var.private_subnets[count.index]
    }

For_each (Key-based)

    # Better: Keys are stable
    variable "subnets" {
      type = map(object({
        cidr_block = string
        az         = string
      }))
    }

    resource "aws_subnet" "private" {
      for_each = var.subnets

      cidr_block        = each.value.cidr_block
      availability_zone = each.value.az

      tags = {
        Name = each.key
      }
    }

    # Usage
    subnets = {
      "private-1a" = { cidr_block = "10.0.1.0/24", az = "us-east-1a" }
      "private-1b" = { cidr_block = "10.0.2.0/24", az = "us-east-1b" }
    }

Sensitive Values

    variable "database_password" {
      description = "Database password"
      type        = string
      sensitive   = true
    }

    output "connection_string" {
      description = "Database connection string"
      value       = "postgres://user:${var.database_password}@${aws_db_instance.this.endpoint}/db"
      sensitive   = true
    }

Module Testing

Terraform Test (1.6+)

    # tests/vpc_test.tftest.hcl
    run "vpc_creates_successfully" {
      command = plan

      variables {
        name       = "test-vpc"
        cidr_block = "10.0.0.0/16"
        azs        = ["us-east-1a"]
      }

      assert {
        condition     = aws_vpc.this.cidr_block == "10.0.0.0/16"
        error_message = "VPC CIDR block incorrect"
      }
    }

Run tests: ...
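Returning to the count versus for_each comparison in this excerpt: switching an existing resource from index-based to key-based addressing changes every instance address, which Terraform would otherwise plan as destroy-and-recreate. A hedged sketch of one way to avoid that uses `moved` blocks (available in Terraform 1.1 and later). The pairing of index 0 with "private-1a" and index 1 with "private-1b" is an assumption based on the sample CIDR blocks, not something the post states.

```hcl
# Sketch only: record that the old count-indexed subnets are the same
# objects as the new for_each-keyed ones, so no subnet is recreated.
moved {
  from = aws_subnet.private[0]
  to   = aws_subnet.private["private-1a"] # assumed mapping
}

moved {
  from = aws_subnet.private[1]
  to   = aws_subnet.private["private-1b"] # assumed mapping
}
```

After one successful apply the `moved` blocks have done their job; whether to keep them around for other consumers of the module is a judgment call.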

February 28, 2026 · 8 min · 1602 words · Rob Washington

Terraform State Management: Avoiding the Footguns

Terraform state is both essential and dangerous. It’s how Terraform knows what exists, what changed, and what to do. Mismanage it, and you’ll either destroy production or spend hours untangling drift.

What State Actually Is

State is Terraform’s record of reality. It maps your configuration to real resources:

    {
      "resources": [
        {
          "type": "aws_instance",
          "name": "web",
          "instances": [
            {
              "attributes": {
                "id": "i-0abc123def456",
                "ami": "ami-12345678",
                "instance_type": "t3.medium"
              }
            }
          ]
        }
      ]
    }

Without state, Terraform would: ...

February 24, 2026 · 7 min · 1386 words · Rob Washington

Infrastructure as Code: Principles That Actually Matter

Infrastructure as Code (IaC) means your servers, networks, and services are defined in version-controlled files rather than clicked into existence through consoles. The benefits are obvious: reproducibility, auditability, collaboration. But IaC done poorly creates its own problems: state drift, copy-paste sprawl, untestable configurations. The principles matter more than the tools.

Declarative Over Imperative

Describe what you want, not how to get there:

    # Declarative (Terraform) - what
    resource "aws_instance" "web" {
      ami           = "ami-0c55b159cbfafe1f0"
      instance_type = "t3.micro"

      tags = {
        Name = "web-server"
      }
    }

    # Imperative (script) - how
    aws ec2 run-instances \
      --image-id ami-0c55b159cbfafe1f0 \
      --instance-type t3.micro \
      --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=web-server}]'

Declarative code is idempotent: run it ten times, get the same result. Imperative scripts need guards against re-running. ...

February 23, 2026 · 6 min · 1251 words · Rob Washington

Terraform State Management: Avoiding the Footguns

Terraform state is where infrastructure-as-code meets reality. It’s also where most Terraform disasters originate. Here’s how to manage state without losing sleep.

The Problem

Terraform tracks what it’s created in a state file. This file maps your HCL resources to real infrastructure. Without it, Terraform can’t update or destroy anything because it doesn’t know what exists.

The default is a local file called terraform.tfstate. This works fine until:

    Someone else needs to run Terraform
    Your laptop dies
    Two people run apply simultaneously
    You accidentally commit secrets to Git

Rule 1: Remote State from Day One

Never use local state for anything beyond experiments: ...
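As an illustration of what remote state can look like, here is a minimal sketch of an S3 backend with locking. The original post's own example is truncated, so this is not necessarily what it shows: the bucket, key, and lock table names are hypothetical, and S3 is just one of several backends Terraform supports.

```hcl
# Sketch only: store state in a shared, encrypted S3 bucket instead of
# a local terraform.tfstate file. All names below are hypothetical.
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"        # hypothetical bucket
    key     = "prod/network/terraform.tfstate" # hypothetical state path
    region  = "us-east-1"
    encrypt = true

    # Locking prevents two simultaneous applies from corrupting state.
    # Recent Terraform releases also offer use_lockfile as an alternative.
    dynamodb_table = "example-terraform-locks" # hypothetical lock table
  }
}
```

Changing the backend requires re-running `terraform init`, which offers to migrate any existing local state to the new location.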

February 22, 2026 · 6 min · 1210 words · Rob Washington

Ansible Playbook Patterns: Writing Maintainable Infrastructure Code

Ansible playbooks can quickly become unwieldy spaghetti. Here are battle-tested patterns for writing infrastructure code that scales with your team and your infrastructure.

The Role Structure That Actually Works

Forget the minimal examples. Real roles need this structure:

    roles/
    └── webserver/
        ├── defaults/
        │   └── main.yml        # Default variables (lowest precedence)
        ├── vars/
        │   └── main.yml        # Role variables (higher precedence)
        ├── tasks/
        │   ├── main.yml        # Entry point - just includes
        │   ├── install.yml     # Package installation
        │   ├── configure.yml   # Configuration files
        │   └── service.yml     # Service management
        ├── handlers/
        │   └── main.yml        # Restart/reload handlers
        ├── templates/
        │   └── nginx.conf.j2
        ├── files/
        │   └── ssl-params.conf
        └── meta/
            └── main.yml        # Dependencies

The key insight: tasks/main.yml should only contain includes: ...

February 17, 2026 · 7 min · 1328 words · Rob Washington