Schema changes are scary. One bad migration can take down your entire application. Here’s how to evolve your database safely, even with users actively hitting it.

The Core Problem

You need to rename a column. Sounds simple:

ALTER TABLE users RENAME COLUMN name TO full_name;

But your application is running. The moment this executes:

  • Old code that reads name breaks immediately
  • New code that reads full_name isn't live yet (the deploy hasn't finished)
  • Until it is, you have a window of guaranteed failures

This is why database migrations need careful choreography.

The Expand-Contract Pattern

The safest approach: never make destructive changes directly. Instead:

  1. Expand — Add new structure alongside old
  2. Migrate — Move data, update code
  3. Contract — Remove old structure

Example: Renaming a Column

Step 1: Expand (add new column)

ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
UPDATE users SET full_name = name WHERE full_name IS NULL;

Both columns exist. Old code works. New code can start using full_name.

Step 2: Dual-write (application change)

# Write to both columns
user.name = value
user.full_name = value
db.save(user)

# Read from new column, fall back to old
name = user.full_name or user.name

Deploy this. All new writes go to both columns.

Step 3: Backfill

UPDATE users SET full_name = name WHERE full_name IS NULL;

Now all rows have full_name populated.

Step 4: Switch reads (application change)

# Read only from new column
name = user.full_name

Deploy this. Old column is now unused.

Step 5: Contract (remove old column)

ALTER TABLE users DROP COLUMN name;

Done. Zero downtime through the entire process.
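
The whole choreography can be rehearsed locally before touching production. A minimal sketch using an in-memory SQLite database as a stand-in for Postgres (SQLite 3.35+ is assumed for DROP COLUMN support); table and column names match the example above:

```python
import sqlite3

# In-memory stand-in for the production database
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('Ada'), ('Grace')")

# Step 1: Expand -- add the new column alongside the old one
db.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

# Step 3: Backfill -- copy existing data into the new column
db.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")

# Step 4: Switch reads -- the application now reads only full_name
names = [row[0] for row in db.execute("SELECT full_name FROM users ORDER BY id")]

# Step 5: Contract -- the old column is unused and can be dropped
db.execute("ALTER TABLE users DROP COLUMN name")

print(names)  # ['Ada', 'Grace']
```

The dual-write step (step 2) lives in application code, so it doesn't appear in this schema-only rehearsal.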

Migrations That Lock Tables

Some operations lock the entire table, blocking all reads and writes:

Dangerous operations:

  • ALTER TABLE ... ADD COLUMN ... DEFAULT value (pre-Postgres 11)
  • CREATE INDEX without CONCURRENTLY
  • ALTER TABLE ... ALTER COLUMN ... TYPE (data conversion)
  • Large UPDATE statements
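
One extra safeguard for any DDL on this list: set a lock timeout, so an ALTER TABLE that can't get its lock fails fast instead of queueing behind a long-running transaction and blocking every query queued after it. A sketch; the `guarded_ddl` helper is hypothetical and simply prepends Postgres timeout settings to the statements you pass in:

```python
def guarded_ddl(statements, lock_timeout="2s", statement_timeout="60s"):
    """Prepend Postgres timeout settings so DDL that can't acquire its
    lock aborts quickly instead of stalling the queue behind it.
    Returns the SQL statements to execute, in order."""
    return [
        f"SET lock_timeout = '{lock_timeout}';",
        f"SET statement_timeout = '{statement_timeout}';",
        *statements,
    ]

plan = guarded_ddl(["ALTER TABLE users ADD COLUMN full_name VARCHAR(255);"])
for sql in plan:
    print(sql)
```

If the timeout fires, the migration errors out cleanly and you simply retry during a quieter window.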

Safe Index Creation

-- Bad: Locks table for entire build
CREATE INDEX idx_users_email ON users(email);

-- Good: Builds in background
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

CONCURRENTLY takes longer, can't run inside a transaction block, and leaves behind an INVALID index if the build fails (drop it and retry). In exchange, it never blocks writes.

Safe Column Additions (Postgres 11+)

-- Postgres 11+: Instant, no rewrite
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- Pre-11: Rewrites entire table (bad)
-- Use this instead:
ALTER TABLE users ADD COLUMN status VARCHAR(20);
UPDATE users SET status = 'active' WHERE status IS NULL;  -- batch this on large tables
ALTER TABLE users ALTER COLUMN status SET DEFAULT 'active';

Batched Updates

Large updates should be batched:

-- Bad: Locks millions of rows
UPDATE users SET migrated = true;

-- Good: Process in chunks
DO $$
DECLARE
    batch_size INT := 10000;
    rows_updated INT;
BEGIN
    LOOP
        UPDATE users 
        SET migrated = true 
        WHERE id IN (
            SELECT id FROM users 
            WHERE migrated IS NULL 
            LIMIT batch_size
            FOR UPDATE SKIP LOCKED
        );
        
        GET DIAGNOSTICS rows_updated = ROW_COUNT;
        EXIT WHEN rows_updated = 0;
        
        COMMIT;  -- COMMIT in a DO block needs Postgres 11+,
                 -- run outside an explicit transaction
        PERFORM pg_sleep(0.1);  -- brief pause to let other work through
    END LOOP;
END $$;
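
The same chunked loop can be driven from application code instead of PL/pgSQL, which is handy when your migration runner is a script. A sketch against SQLite for illustration (swap in your real connection; SKIP LOCKED has no SQLite equivalent, so it's omitted here):

```python
import sqlite3

def batched_update(db, batch_size=10_000):
    """Set migrated = 1 in chunks, committing between batches so each
    one holds its locks only briefly. Returns total rows updated."""
    total = 0
    while True:
        cur = db.execute(
            "UPDATE users SET migrated = 1 WHERE id IN ("
            "  SELECT id FROM users WHERE migrated IS NULL LIMIT ?"
            ")",
            (batch_size,),
        )
        db.commit()  # release locks between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total

# Demo: 25 unmigrated rows, processed in batches of 10
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, migrated INTEGER)")
db.executemany("INSERT INTO users (migrated) VALUES (?)", [(None,)] * 25)
total = batched_update(db, batch_size=10)
print(total)  # 25
```

Driving it from a script also makes it easy to add progress logging, adaptive batch sizing, or a longer sleep when replication lag climbs.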

Migration Tooling

Framework Migrations

Most frameworks have built-in migrations:

# Django
class Migration(migrations.Migration):
    operations = [
        migrations.AddField(
            model_name='user',
            name='full_name',
            field=models.CharField(max_length=255, null=True),
        ),
    ]

# Run with: python manage.py migrate
// Knex.js
exports.up = function(knex) {
    return knex.schema.alterTable('users', table => {
        table.string('full_name', 255);
    });
};

exports.down = function(knex) {
    return knex.schema.alterTable('users', table => {
        table.dropColumn('full_name');
    });
};

Standalone Tools

For more control:

Flyway — SQL-based, version controlled

V1__create_users.sql
V2__add_full_name.sql
V3__drop_name_column.sql

golang-migrate — Language agnostic

migrate -database postgres://... -path ./migrations up

gh-ost (MySQL) — Online schema changes

gh-ost \
  --alter="ADD COLUMN full_name VARCHAR(255)" \
  --table=users \
  --execute

Deployment Ordering

The order matters:

Adding a Column

  1. Deploy migration (add column)
  2. Deploy application (code that uses new column)

Removing a Column

  1. Deploy application (remove all references to column)
  2. Deploy migration (drop column)

Renaming (Expand-Contract)

  1. Deploy migration (add new column)
  2. Deploy application (dual-write)
  3. Deploy migration (backfill)
  4. Deploy application (read from new)
  5. Deploy migration (drop old column)

Testing Migrations

Local Testing

# Apply all migrations to fresh DB
docker-compose up -d postgres
./migrate up

# Seed with production-like data
./seed-data.sh

# Test the problematic migration
./migrate up +1

# Verify
psql -c "SELECT * FROM users LIMIT 5;"

Staging with Production Data

For critical migrations, test against production-scale data:

# Snapshot production (anonymized)
pg_dump prod | ./anonymize.py | psql staging

# Run migration
./migrate up

# Check duration, locks, errors

Dry Runs

Some tools support dry runs:

# Flyway
flyway -dryRunOutput=dryrun.sql migrate

# Review before applying
cat dryrun.sql

Rollback Strategies

Reversible Migrations

Always write down migrations:

# Up
migrations.AddField('user', 'full_name', ...)

# Down
migrations.RemoveField('user', 'full_name')

Forward-Only

Some teams go forward-only — no rollbacks, only new migrations that fix issues:

V5__add_status.sql       # Oops, wrong type
V6__fix_status_type.sql  # Fix it with a new migration

Safer than rollbacks that might lose data.

Feature Flags

Combine with feature flags for instant rollback:

if feature_enabled('use_new_schema'):
    return user.full_name
else:
    return user.name

Flip the flag if something’s wrong — no deployment needed.

Monitoring Migrations

Before Running

-- Check table size (large = slow migration)
SELECT pg_size_pretty(pg_total_relation_size('users'));

-- Check active connections
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';

-- Estimate update time (note: EXPLAIN ANALYZE actually executes the
-- statement, so wrap it in a transaction and roll back; UPDATE has no
-- LIMIT clause in Postgres, so limit via a subquery)
BEGIN;
EXPLAIN ANALYZE UPDATE users SET status = 'active'
WHERE id IN (SELECT id FROM users LIMIT 1000);
ROLLBACK;

During Migration

-- Watch for locks (simplified: the full recipe also matches on
-- database, relation, page, tuple, and the other lock-identity columns)
SELECT blocked_locks.pid AS blocked_pid,
       blocking_locks.pid AS blocking_pid,
       blocked_activity.query AS blocked_query
FROM pg_locks blocked_locks
JOIN pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
 AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
 AND blocking_locks.pid != blocked_locks.pid
JOIN pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
WHERE NOT blocked_locks.granted
  AND blocking_locks.granted;

After Migration

  • Check error rates in application
  • Verify query performance (new indexes working?)
  • Monitor replication lag

Common Mistakes

1. Not Testing at Scale

A migration that takes 1 second on 1000 rows might take 1 hour on 10 million.
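
Linear extrapolation from a timed sample gives you a floor for the estimate; real runs are usually worse once indexes, WAL volume, and lock contention kick in. The `estimate_seconds` helper below is hypothetical:

```python
def estimate_seconds(sample_rows, sample_seconds, total_rows):
    """Naive linear extrapolation of migration time from a timed sample."""
    return total_rows * sample_seconds / sample_rows

# 1 second per 1,000 rows, extrapolated to 10 million rows:
est = estimate_seconds(1_000, 1.0, 10_000_000)
print(f"{est:.0f} s (~{est / 3600:.1f} h)")  # 10000 s (~2.8 h)
```

If even the optimistic linear estimate won't fit your maintenance window, redesign the migration before running it.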

2. Running During Peak Hours

Schedule migrations during low-traffic windows. Even “safe” operations have overhead.

3. No Backout Plan

Know how to reverse before you start. Have the rollback migration ready.

4. Mixing Migration and Application Changes

Keep them separate. Migration first (or last), application separate.

5. Ignoring Replication Lag

In replicated setups, reads might hit replicas that haven’t caught up. Plan for temporary inconsistency.
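
One pragmatic mitigation is to pin a user's reads to the primary for a few seconds after they write, so they always read their own writes while replicas catch up. A sketch; the class name and window are made up, and the window should be tuned to your observed replication lag:

```python
import time

class ReadRouter:
    """Pin each user's reads to the primary for a short window after
    they write (read-your-writes during replication lag)."""

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._last_write = {}

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        wrote_at = self._last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.window:
            return "primary"
        return "replica"

router = ReadRouter(window_seconds=5.0)
router.record_write(user_id=42)
print(router.target_for_read(42))  # primary
print(router.target_for_read(7))   # replica
```

The injected `clock` makes the stickiness window easy to unit-test without sleeping.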

The Bottom Line

  1. Never make breaking changes directly — Use expand-contract
  2. Use CONCURRENTLY for indexes — Avoid table locks
  3. Batch large updates — Don’t lock millions of rows
  4. Test at production scale — Small-data tests lie
  5. Deploy in order — Migration timing matters
  6. Monitor everything — Locks, errors, performance

Your database is your most valuable asset. Treat migrations with the respect they deserve.


Databases outlive applications. Migrate carefully, and your data will thank you.