Your cloud bill is too high. It always is. Here’s how to actually reduce it without breaking things.
## Quick Wins

### 1. Find Unused Resources
```bash
# AWS: Find unattached EBS volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
  --output table

# Find old snapshots (>90 days)
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[?StartTime<=`2026-01-01`].[SnapshotId,VolumeSize,StartTime]' \
  --output table

# Find unattached Elastic IPs
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
  --output table
```
Delete them. Unattached EBS volumes keep billing at the full per-GB rate whether or not anything reads them, and each unused Elastic IP costs $3.65/month.
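To put a number on the waste, here's a back-of-envelope helper. The $0.08/GB-month rate is an assumed us-east-1 gp3 price; substitute your region's rate:

```python
# Estimate the monthly cost of unattached EBS volumes.
# price_per_gb is an assumed gp3 rate (~$0.08/GB-month in us-east-1).
def unattached_monthly_cost(volumes, price_per_gb=0.08):
    """volumes: (volume_id, size_gb) pairs, e.g. parsed from the CLI output above."""
    return sum(size_gb * price_per_gb for _, size_gb in volumes)

# Two hypothetical orphaned volumes totalling 600 GB:
waste = unattached_monthly_cost([("vol-aaa", 100), ("vol-bbb", 500)])
print(f"${waste:.2f}/month wasted")
```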
### 2. Right-Size Instances
```bash
# Check CPU utilization
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --start-time 2026-03-01T00:00:00Z \
  --end-time 2026-03-11T00:00:00Z \
  --period 86400 \
  --statistics Average
```
If average CPU < 20%, downsize. A t3.large at 10% CPU should be a t3.small.
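That rule of thumb can be mechanized. A small heuristic sketch, assuming each t3 size step roughly doubles capacity (so average CPU roughly doubles per step down):

```python
# Rough right-sizing heuristic for the t3 family. Each size step roughly
# doubles vCPU/memory, so stepping down roughly doubles average CPU.
T3_SIZES = ["nano", "micro", "small", "medium", "large", "xlarge", "2xlarge"]

def suggest_t3_size(current, avg_cpu_percent):
    """Step down while the projected average CPU stays at or under ~40%."""
    i = T3_SIZES.index(current)
    cpu = avg_cpu_percent
    while i > 0 and cpu * 2 <= 40:
        i -= 1
        cpu *= 2
    return f"t3.{T3_SIZES[i]}", cpu

print(suggest_t3_size("large", 10))  # ('t3.small', 40): two sizes down
```

This reproduces the claim above: a t3.large idling at 10% average CPU projects to ~40% on a t3.small.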
### 3. Use Spot Instances
For fault-tolerant workloads:
```hcl
# Terraform
resource "aws_spot_instance_request" "worker" {
  ami           = "ami-12345678"
  instance_type = "c5.xlarge"
  spot_price    = "0.10" # Max you'll pay

  # Spot instances can be interrupted
  instance_interruption_behavior = "terminate"
}
```
Savings: 60-90% vs on-demand.
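To see what that range means in dollars, a quick comparison. The $0.17/hour on-demand figure for c5.xlarge is an assumption (roughly us-east-1); check current on-demand and spot prices for your region:

```python
# Compare monthly on-demand vs spot cost for one instance.
def spot_savings(on_demand_hourly, spot_hourly, hours_per_month=730):
    saved = (on_demand_hourly - spot_hourly) * hours_per_month
    pct = (1 - spot_hourly / on_demand_hourly) * 100
    return saved, pct

# Assumed prices: c5.xlarge on-demand ~$0.17/hr, spot often around $0.06/hr.
saved, pct = spot_savings(0.17, 0.06)
print(f"${saved:.2f}/month saved ({pct:.0f}%)")
```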
### 4. Reserved Instances / Savings Plans
For predictable workloads:
| Commitment | Discount |
|---|---|
| No commitment | 0% |
| 1 year, no upfront | ~30% |
| 1 year, all upfront | ~40% |
| 3 year, all upfront | ~60% |
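Translating the table into annual dollars for a hypothetical $1,000/month of on-demand spend (discount figures are the approximate ones above):

```python
# Annual cost at each commitment level, given an approximate discount.
def annual_cost(on_demand_monthly, discount_pct):
    return on_demand_monthly * 12 * (1 - discount_pct / 100)

monthly = 1000  # hypothetical $1,000/month on-demand spend
for label, disc in [("No commitment", 0), ("1yr no upfront", 30),
                    ("1yr all upfront", 40), ("3yr all upfront", 60)]:
    print(f"{label}: ${annual_cost(monthly, disc):,.0f}/yr")
```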
```bash
# Check reservation coverage
aws ce get-reservation-coverage \
  --time-period Start=2026-03-01,End=2026-03-11 \
  --group-by Type=DIMENSION,Key=SERVICE
```
## Storage Optimization

### S3 Lifecycle Policies
```json
{
  "Rules": [
    {
      "ID": "MoveToIA",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
      ],
      "Expiration": {"Days": 730}
    }
  ]
}
```
| Storage Class | Cost (per GB/month) | Use Case |
|---|---|---|
| Standard | $0.023 | Frequent access |
| Standard-IA | $0.0125 | Infrequent access |
| Glacier | $0.004 | Archive (minutes retrieval) |
| Deep Archive | $0.00099 | Long-term archive (hours) |
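To make the table concrete, here's what a terabyte costs per month in each class at the prices listed (storage only; retrieval and request fees are extra):

```python
# Monthly storage cost per class, using the per-GB prices from the table.
PRICES = {"Standard": 0.023, "Standard-IA": 0.0125,
          "Glacier": 0.004, "Deep Archive": 0.00099}

def monthly_cost(gb, storage_class):
    return gb * PRICES[storage_class]

for cls in PRICES:
    print(f"1 TB in {cls}: ${monthly_cost(1024, cls):.2f}/month")
```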
### S3 Intelligent Tiering
Let AWS optimize for you:
```bash
aws s3 cp myfile.txt s3://mybucket/ \
  --storage-class INTELLIGENT_TIERING
```
Automatically moves objects between tiers based on access patterns.
### EBS Optimization
```bash
# Find over-provisioned volumes
aws ec2 describe-volumes \
  --query 'Volumes[?Size>`100`].[VolumeId,Size,VolumeType,Iops]'

# gp3 is usually cheaper than gp2
# gp2: $0.10/GB, IOPS scale with size (3 IOPS per GB)
# gp3: $0.08/GB, 3,000 IOPS and 125 MB/s included; extra IOPS at $0.005/IOPS-month
```
Migrate gp2 to gp3:
```bash
aws ec2 modify-volume \
  --volume-id vol-1234567890abcdef0 \
  --volume-type gp3 \
  --iops 3000 \
  --throughput 125
```
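Using the per-GB prices quoted earlier ($0.10 gp2 vs $0.08 gp3), you can estimate the monthly win before running the migration. This ignores any charges for IOPS or throughput beyond the included baseline:

```python
# Estimated monthly savings from migrating gp2 volumes to gp3,
# at the assumed per-GB prices ($0.10 gp2 vs $0.08 gp3; region-dependent).
def gp3_savings(gp2_volume_sizes_gb):
    return sum(size * (0.10 - 0.08) for size in gp2_volume_sizes_gb)

# Three hypothetical gp2 volumes totalling 2,600 GB:
print(f"${gp3_savings([100, 500, 2000]):.2f}/month")
```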
## Compute Optimization

### Auto Scaling
Scale down when you don’t need capacity:
```hcl
# Terraform
resource "aws_autoscaling_schedule" "scale_down_night" {
  scheduled_action_name  = "scale-down-night"
  autoscaling_group_name = aws_autoscaling_group.app.name
  min_size               = 1
  max_size               = 2
  desired_capacity       = 1
  recurrence             = "0 22 * * *" # 10 PM
}

resource "aws_autoscaling_schedule" "scale_up_morning" {
  scheduled_action_name  = "scale-up-morning"
  autoscaling_group_name = aws_autoscaling_group.app.name
  min_size               = 2
  max_size               = 10
  desired_capacity       = 4
  recurrence             = "0 6 * * 1-5" # 6 AM weekdays
}
```
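Back-of-envelope for what a schedule like this saves, assuming a hypothetical $0.10/hour instance price and the fleet dropping from 4 to 1 instances overnight and over weekends:

```python
# Rough monthly savings from scheduled scale-down.
# Assumes 8-hour weeknight windows plus a 48-hour weekend window;
# the $0.10/hr price and fleet sizes are hypothetical.
def scheduled_scaling_savings(hourly, day_count, night_count,
                              night_hours=8, weekend_hours=48):
    weekly_saved_hours = (day_count - night_count) * (night_hours * 5 + weekend_hours)
    return weekly_saved_hours * hourly * 4.33  # average weeks per month

print(f"${scheduled_scaling_savings(0.10, 4, 1):.0f}/month")
```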
### Lambda Right-Sizing
```python
# Check if you're over-provisioned
import boto3

client = boto3.client('cloudwatch')

response = client.get_metric_statistics(
    Namespace='AWS/Lambda',
    MetricName='Duration',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'my-function'}],
    StartTime='2026-03-01',
    EndTime='2026-03-11',
    Period=86400,
    Statistics=['Average', 'Maximum']
)

# If max duration << timeout, reduce memory
# Lambda CPU scales with memory
```
Use [AWS Lambda Power Tuning](https://github.com/alexcasalboni/aws-lambda-power-tuning) to find the optimal memory setting.
### Container Optimization
```yaml
# Set resource limits to avoid over-provisioning
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```
Use Karpenter for Kubernetes node optimization:
```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"] # Prefer spot
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["t3.medium", "t3.large", "t3.xlarge"]
```
## Database Optimization

### RDS Right-Sizing
```sql
-- Check if you need that big instance
SHOW STATUS LIKE 'Max_used_connections';
-- If max << max_connections, downsize

-- Check buffer pool usage (MySQL)
SHOW STATUS LIKE 'Innodb_buffer_pool%';
-- If pages_free is high, reduce instance size
```
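Among the `Innodb_buffer_pool%` counters, `Innodb_buffer_pool_read_requests` counts logical reads and `Innodb_buffer_pool_reads` counts the ones that had to hit disk; their ratio tells you whether memory is actually the constraint:

```python
# Buffer pool hit ratio: fraction of logical reads served from memory.
def buffer_pool_hit_ratio(read_requests, disk_reads):
    return 1 - disk_reads / read_requests

# Hypothetical counter values from SHOW STATUS:
ratio = buffer_pool_hit_ratio(read_requests=10_000_000, disk_reads=5_000)
print(f"{ratio:.2%}")  # 99.95% -- memory is not the bottleneck here
```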
### Aurora Serverless v2
Pay for what you use:
```hcl
resource "aws_rds_cluster" "aurora" {
  engine      = "aurora-postgresql"
  engine_mode = "provisioned" # Serverless v2 uses "provisioned" mode

  serverlessv2_scaling_configuration {
    min_capacity = 0.5 # Scale to near-zero
    max_capacity = 16
  }
}
```
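Rough monthly cost at the configured bounds, assuming roughly $0.12 per ACU-hour (an assumed us-east-1 figure; check current Aurora Serverless v2 pricing):

```python
# Monthly Aurora Serverless v2 compute cost at a given average capacity.
# price_per_acu_hour is an assumption; verify against current pricing.
def aurora_v2_monthly(avg_acu, price_per_acu_hour=0.12, hours=730):
    return avg_acu * price_per_acu_hour * hours

print(f"Idle at 0.5 ACU: ${aurora_v2_monthly(0.5):.2f}/month")
print(f"Busy at 8 ACU:   ${aurora_v2_monthly(8):.2f}/month")
```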
### Read Replicas
Offload reads to cheaper replicas:
```python
# Pseudocode: connect() stands in for your driver's connection call.
# Write to primary
write_db = connect(primary_endpoint)
write_db.execute("INSERT INTO users ...")

# Read from replica
read_db = connect(replica_endpoint)
users = read_db.execute("SELECT * FROM users")
```
## Monitoring Costs

### AWS Cost Explorer API
```python
import boto3

client = boto3.client('ce')

response = client.get_cost_and_usage(
    TimePeriod={
        'Start': '2026-03-01',
        'End': '2026-03-11'
    },
    Granularity='DAILY',
    Metrics=['UnblendedCost'],
    GroupBy=[
        {'Type': 'DIMENSION', 'Key': 'SERVICE'}
    ]
)

for day in response['ResultsByTime']:
    print(f"{day['TimePeriod']['Start']}:")
    for group in day['Groups']:
        service = group['Keys'][0]
        cost = group['Metrics']['UnblendedCost']['Amount']
        print(f"  {service}: ${float(cost):.2f}")
```
### Budget Alerts
```hcl
# Terraform
resource "aws_budgets_budget" "monthly" {
  name         = "monthly-budget"
  budget_type  = "COST"
  limit_amount = "1000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["alerts@example.com"]
  }
}
```
Tag everything:
```hcl
resource "aws_instance" "web" {
  # ...
  tags = {
    Environment = "production"
    Team        = "platform"
    Project     = "api"
    CostCenter  = "engineering"
  }
}
```
Then filter costs by tag:
```bash
aws ce get-cost-and-usage \
  --time-period Start=2026-03-01,End=2026-03-11 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=TAG,Key=Project
```
## The Cost Optimization Checklist

Weekly:

- Scan for unattached volumes, old snapshots, and unused Elastic IPs
- Glance at Cost Explorer for anomalies

Monthly:

- Review CPU and memory utilization; right-size what's idle
- Compare actual spend against the budget and its alerts

Quarterly:

- Review Reserved Instance / Savings Plan coverage
- Revisit storage lifecycle policies and scaling schedules
## Quick Reference: Savings by Action

| Action | Typical Savings |
|---|---|
| Delete unused resources | 5-15% |
| Right-size instances | 10-30% |
| Reserved Instances (1yr) | 30-40% |
| Spot Instances | 60-90% |
| S3 lifecycle policies | 40-70% on storage |
| Scheduled scaling | 20-40% |
| gp2 → gp3 migration | 20% on EBS |
Start with the quick wins. The biggest savings usually come from things you’re not using at all.
Cloud costs are like subscriptions — they accumulate quietly until you look at the bill. Look at the bill regularly.