Your schema will change. The question is whether you’ll manage those changes or let them manage you.

Why Migrations Matter

Raw SQL scripts don’t track what’s been run. Manual changes don’t scale. And “just drop and recreate” doesn’t work when you have real data.

Migrations solve this:

  • Version control for your database
  • Reproducible changes across environments
  • Rollback capability when things go wrong
  • Team coordination on schema changes

The Tools

Alembic (Python/SQLAlchemy)

pip install alembic
alembic init migrations

Configure alembic.ini:

[alembic]
script_location = migrations
sqlalchemy.url = postgresql://localhost/myapp

Create a migration:

alembic revision -m "add users table"

The migration file:

# migrations/versions/001_add_users_table.py
from alembic import op
import sqlalchemy as sa

revision = '001'
down_revision = None

def upgrade():
    op.create_table(
        'users',
        sa.Column('id', sa.Integer(), primary_key=True),
        sa.Column('email', sa.String(255), nullable=False, unique=True),
        sa.Column('name', sa.String(255)),
        sa.Column('created_at', sa.DateTime(), server_default=sa.func.now()),
    )
    # Note: unique=True above already creates an index in most databases;
    # a separate named index is shown here for illustration
    op.create_index('ix_users_email', 'users', ['email'])

def downgrade():
    op.drop_index('ix_users_email')
    op.drop_table('users')

Run migrations:

# Apply all pending
alembic upgrade head

# Apply specific version
alembic upgrade 001

# Rollback one step
alembic downgrade -1

# See current state
alembic current

Auto-generating Migrations

If you use SQLAlchemy models:

# models.py
from sqlalchemy import Column, Integer, String, DateTime, func
from sqlalchemy.orm import declarative_base  # moved out of .ext.declarative in SQLAlchemy 1.4+

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    
    id = Column(Integer, primary_key=True)
    email = Column(String(255), nullable=False, unique=True)
    name = Column(String(255))
    created_at = Column(DateTime, server_default=func.now())

Generate migration from model changes:

alembic revision --autogenerate -m "add users table"

Alembic compares your models to the database and generates the diff.
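For the comparison to work, Alembic's `env.py` has to be pointed at your model metadata. A minimal sketch of that hookup, assuming your models live in a `models` module (adjust the import path to your project):

```python
# migrations/env.py (excerpt)
# The `models` module name is an assumption; use your project's actual path.
from models import Base

# Alembic diffs this metadata against the live database during --autogenerate
target_metadata = Base.metadata
```

Without this, `--autogenerate` produces empty migrations because Alembic has nothing to compare against.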

Flyway (Java/JVM)

-- V1__create_users.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    name VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX ix_users_email ON users(email);

# Apply pending migrations
flyway migrate

# Show applied and pending migrations
flyway info

# Repair the schema history table after a failed migration
flyway repair

golang-migrate

migrate create -ext sql -dir migrations -seq add_users

# Creates:
# migrations/000001_add_users.up.sql
# migrations/000001_add_users.down.sql

-- 000001_add_users.up.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    name VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW()
);

-- 000001_add_users.down.sql
DROP TABLE users;

migrate -database "postgres://localhost/myapp" -path migrations up
migrate -database "postgres://localhost/myapp" -path migrations down 1

Safe Migration Patterns

Adding Columns

Safe — nullable columns:

def upgrade():
    op.add_column('users', sa.Column('phone', sa.String(20), nullable=True))

Risky — non-nullable without default:

# This will fail if table has existing rows!
def upgrade():
    op.add_column('users', sa.Column('phone', sa.String(20), nullable=False))

Safe — non-nullable with default:

def upgrade():
    op.add_column('users', sa.Column('status', sa.String(20), 
                                      nullable=False, server_default='active'))
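On very large tables, even the "safe" version can be expensive: before PostgreSQL 11, adding a NOT NULL column with a default rewrites the whole table. A hedged sketch of the incremental alternative, which keeps each step cheap (column name and default are illustrative):

```python
# Sketch: incremental NOT NULL rollout, assuming an Alembic migration
# against PostgreSQL older than 11 (newer versions handle a constant
# server_default without a table rewrite).
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Step 1: add the column as nullable -- a fast, metadata-only change
    op.add_column('users', sa.Column('status', sa.String(20), nullable=True))
    # Step 2: backfill existing rows (batch in chunks for very large tables)
    op.execute("UPDATE users SET status = 'active' WHERE status IS NULL")
    # Step 3: enforce the constraint once every row has a value
    op.alter_column('users', 'status', nullable=False)
```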

Renaming Columns

The dangerous way:

# Breaks existing queries immediately
def upgrade():
    op.alter_column('users', 'name', new_column_name='full_name')

The safe way (expand-contract pattern):

# Migration 1: Add new column
def upgrade():
    op.add_column('users', sa.Column('full_name', sa.String(255)))
    op.execute("UPDATE users SET full_name = name")

# Deploy code that writes to BOTH columns
# Deploy code that reads from new column

# Migration 2: Drop old column (after code is deployed)
def upgrade():
    op.drop_column('users', 'name')

Adding Indexes

For small tables:

def upgrade():
    op.create_index('ix_users_email', 'users', ['email'])

For large tables (PostgreSQL):

def upgrade():
    # CONCURRENTLY builds the index without blocking reads or writes
    op.execute("CREATE INDEX CONCURRENTLY ix_users_email ON users(email)")

Note: CREATE INDEX CONCURRENTLY can’t run inside a transaction.
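Alembic provides `autocommit_block()` (Alembic 1.2+) for exactly this case. A sketch of how the migration above can escape the surrounding transaction:

```python
# Sketch: running non-transactional DDL from an Alembic migration,
# assuming Alembic 1.2+ and a PostgreSQL backend.
from alembic import op

def upgrade():
    # autocommit_block() commits the migration's in-progress transaction
    # and runs the enclosed statements outside of one, which
    # CREATE INDEX CONCURRENTLY requires
    with op.get_context().autocommit_block():
        op.execute("CREATE INDEX CONCURRENTLY ix_users_email ON users(email)")
```

Keep in mind that statements inside the block won't roll back if a later step of the migration fails.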

Dropping Tables

Always back up first:

def upgrade():
    # Rename instead of drop for safety
    op.rename_table('old_users', 'old_users_backup_20260211')

def downgrade():
    op.rename_table('old_users_backup_20260211', 'old_users')

Zero-Downtime Migrations

The goal: deploy schema changes without taking your app offline.

The Expand-Contract Pattern

Phase 1: Expand

  • Add new columns/tables
  • Keep old structure intact
  • Deploy code that writes to both

Phase 2: Migrate Data

  • Backfill new columns
  • Verify data integrity

Phase 3: Contract

  • Deploy code that only uses new structure
  • Remove old columns/tables

Example: Splitting a name field

# Current: users.name = "John Doe"
# Target: users.first_name, users.last_name

# Migration 1: Add new columns
def upgrade():
    op.add_column('users', sa.Column('first_name', sa.String(127)))
    op.add_column('users', sa.Column('last_name', sa.String(127)))

# Deploy code that writes to all three

# Migration 2: Backfill
def upgrade():
    op.execute("""
        UPDATE users 
        SET first_name = split_part(name, ' ', 1),
            last_name = split_part(name, ' ', 2)
        WHERE first_name IS NULL
    """)

# Deploy code that reads from new columns

# Migration 3: Remove old column
def upgrade():
    op.drop_column('users', 'name')

Migration Safety Checklist

Before running in production:

  • Tested in staging with production-like data
  • Rollback script tested
  • Backup taken
  • Estimated run time known
  • Lock time acceptable
  • Code compatible with both old and new schema

Handling Failures

Wrap migrations in transactions:

# alembic/env.py
def run_migrations_online():
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            transaction_per_migration=True,  # Each migration in its own transaction
        )
        with context.begin_transaction():
            context.run_migrations()

Manual rollback procedure:

# 1. Identify current version
alembic current

# 2. Rollback to last known good
alembic downgrade abc123

# 3. Fix the migration
# 4. Re-run
alembic upgrade head

CI/CD Integration

# .github/workflows/deploy.yml
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Run migrations
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        run: |
          pip install alembic psycopg2-binary
          alembic upgrade head
      
      - name: Deploy application
        run: ./deploy.sh

Always migrate before deploying new code — your new code expects the new schema.

The Golden Rules

  1. Never edit a migration that’s been applied — create a new one
  2. Always write downgrade functions — you’ll need them
  3. Test migrations with real data volumes — 10 rows ≠ 10 million rows
  4. One logical change per migration — easier to debug and rollback
  5. Review migrations in PRs — schema changes deserve scrutiny

Your database is the foundation. Treat migrations with the respect they deserve.