# Migration Testing
Schema migrations are one of the riskiest deployment operations. A bad migration can cause data loss, corruption, or extended downtime. Testing migrations before they reach production is essential — and often overlooked.
## What Makes Migrations Risky

| Risk | Example | Impact |
|---|---|---|
| Data loss | Dropping a column that still has data | Irreversible data destruction |
| Data corruption | Type conversion truncates values | Silent data modification |
| Downtime | Adding an index on a large table without CONCURRENTLY | Table locked for minutes |
| Failed rollback | Rollback script does not restore original schema | Cannot undo a bad migration |
| Default value bug | New NOT NULL column without default — existing rows fail | Application errors |
| Foreign key break | Adding FK to column with orphaned data | Migration fails |
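The last risk in the table is easy to check for up front: before a migration adds a foreign key constraint, count the child rows whose key has no matching parent. A minimal sketch, using an in-memory SQLite database and hypothetical table names to stand in for the real schema:

```python
import sqlite3

def find_orphans(conn, child_table, fk_column, parent_table, parent_key):
    """Count child rows whose FK value has no matching parent row.

    Run a check like this before a migration that adds a foreign key
    constraint: orphaned rows will make the ALTER TABLE fail.
    """
    query = (
        f"SELECT COUNT(*) FROM {child_table} c "
        f"LEFT JOIN {parent_table} p ON c.{fk_column} = p.{parent_key} "
        f"WHERE c.{fk_column} IS NOT NULL AND p.{parent_key} IS NULL"
    )
    return conn.execute(query).fetchone()[0]

# Demo schema: one order references a user that does not exist
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id) VALUES (1), (2);
    INSERT INTO orders (id, user_id) VALUES (1, 1), (2, 2), (3, 99);
""")
orphans = find_orphans(conn, "orders", "user_id", "users", "id")
print(orphans)  # 1 — order 3 references nonexistent user 99
```

If the count is nonzero, clean up or quarantine the orphans in a separate step before the constraint migration runs.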
## Migration Testing Checklist
| Test | How |
|---|---|
| Migration applies cleanly | Run on a copy of production schema |
| Rollback works | Apply, roll back — schema returns to original state |
| Data is preserved | Insert known data before migration, verify after |
| Default values | New NOT NULL column must have a default — verify existing rows |
| Performance | Adding index on large table must not lock for minutes |
| Application compatibility | New schema works with both old and new application code |
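Several checklist items can also be caught statically, before anything touches a database. The sketch below is an illustrative lint pass over migration SQL; the regexes are deliberately simplified (not a real SQL parser) and the rule list is an assumption, not a standard tool:

```python
import re

# Simplified patterns for risky statements; a real linter would parse the SQL
RISKY_PATTERNS = [
    # ADD COLUMN ... NOT NULL without DEFAULT breaks tables with existing rows
    (re.compile(r"ADD\s+COLUMN\s+\w+\s+\w+[^;]*NOT\s+NULL(?![^;]*DEFAULT)", re.I),
     "NOT NULL column without DEFAULT"),
    # Plain CREATE INDEX locks the table; CONCURRENTLY avoids that (PostgreSQL)
    (re.compile(r"CREATE\s+INDEX\s+(?!CONCURRENTLY)", re.I),
     "CREATE INDEX without CONCURRENTLY"),
    # DROP COLUMN destroys data irreversibly
    (re.compile(r"DROP\s+COLUMN", re.I), "DROP COLUMN (data loss)"),
]

def lint_migration(sql):
    """Return a list of warnings for risky statements in a migration file."""
    return [msg for pattern, msg in RISKY_PATTERNS if pattern.search(sql)]

warnings = lint_migration(
    "ALTER TABLE users ADD COLUMN status varchar(20) NOT NULL;"
)
print(warnings)  # ['NOT NULL column without DEFAULT']
```

A check like this runs in milliseconds, so it fits naturally as a pre-commit hook or an early CI step, before the heavier database-backed tests below.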
## Testing Migrations with Scripts

```bash
#!/bin/bash
set -euo pipefail

echo "=== Creating test database from production schema ==="
pg_dump --schema-only production_db > schema.sql
createdb migration_test && psql migration_test < schema.sql

echo "=== Loading seed data ==="
psql migration_test < test_seed_data.sql

echo "=== Recording pre-migration state ==="
psql migration_test -c "SELECT COUNT(*) FROM users" > pre_counts.txt
psql migration_test -c "\d users" > pre_schema.txt

echo "=== Applying migration ==="
psql migration_test < migrations/0042_add_phone_column.sql

echo "=== Verifying migration ==="
psql migration_test -c "\d users"                          # Verify new column exists
psql migration_test -c "SELECT COUNT(*) FROM users" | diff pre_counts.txt -  # Verify no rows lost
psql migration_test -c "SELECT phone FROM users LIMIT 5"   # Verify new column is accessible

echo "=== Testing rollback ==="
psql migration_test < migrations/0042_rollback.sql

echo "=== Verifying rollback ==="
psql migration_test -c "\d users" | diff pre_schema.txt -  # Schema matches pre-migration state
psql migration_test -c "SELECT COUNT(*) FROM users" | diff pre_counts.txt -  # Verify no rows lost

echo "=== Cleanup ==="
dropdb migration_test
echo "=== Migration test PASSED ==="
```

Because `set -e` is in effect, any `diff` against the recorded pre-migration state that finds a difference aborts the script with a failure.
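The same record, apply, roll back, compare loop can be sketched in pure Python. This is an illustrative stand-in, not the script above: SQLite replaces PostgreSQL, the migration and rollback are inline `ALTER TABLE` statements, and it assumes an SQLite build with `DROP COLUMN` support (3.35+):

```python
import sqlite3

def schema_snapshot(conn):
    """Map each table to its ordered (column name, type) pairs."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {
        table: [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({table})")]
        for table in tables
    }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
before = schema_snapshot(conn)

# Apply a hypothetical migration, then its rollback
conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")
assert schema_snapshot(conn) != before, "migration should change the schema"
conn.execute("ALTER TABLE users DROP COLUMN phone")

after = schema_snapshot(conn)
assert after == before, "rollback should restore the original schema"
print("rollback verified")
```

Comparing structured snapshots (column names and types) rather than raw `\d` output makes the equality check robust to formatting differences.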
## Testing Migrations in Python

```python
import subprocess
import time

import psycopg2
import pytest


@pytest.fixture
def migration_db():
    """Create a disposable database for migration testing."""
    db_name = "migration_test_" + str(int(time.time()))
    # Create database from production schema
    subprocess.run(["createdb", db_name], check=True)
    subprocess.run(
        f"pg_dump --schema-only production_db | psql {db_name}",
        shell=True, check=True
    )
    conn = psycopg2.connect(dbname=db_name)
    yield conn, db_name
    conn.close()
    subprocess.run(["dropdb", db_name], check=True)


def test_migration_applies_cleanly(migration_db):
    conn, db_name = migration_db
    # Apply migration
    subprocess.run(
        f"psql {db_name} < migrations/0042_add_phone_column.sql",
        shell=True, check=True
    )
    # Verify new column exists
    cursor = conn.cursor()
    cursor.execute("""
        SELECT column_name, data_type, is_nullable
        FROM information_schema.columns
        WHERE table_name = 'users' AND column_name = 'phone'
    """)
    row = cursor.fetchone()
    assert row is not None, "Phone column was not created"
    assert row[1] == "character varying"  # Expected data type


def test_migration_preserves_data(migration_db):
    conn, db_name = migration_db
    cursor = conn.cursor()
    # Insert known data before migration
    cursor.execute("""
        INSERT INTO users (name, email) VALUES
            ('Alice', 'alice@test.com'),
            ('Bob', 'bob@test.com')
    """)
    conn.commit()
    # Apply migration
    subprocess.run(
        f"psql {db_name} < migrations/0042_add_phone_column.sql",
        shell=True, check=True
    )
    # Verify data is preserved
    conn = psycopg2.connect(dbname=db_name)  # Reconnect for a fresh snapshot
    cursor = conn.cursor()
    cursor.execute("SELECT name, email FROM users ORDER BY name")
    rows = cursor.fetchall()
    assert len(rows) == 2
    assert rows[0][0] == "Alice"
    assert rows[1][0] == "Bob"


def test_migration_rollback(migration_db):
    conn, db_name = migration_db
    # Apply migration
    subprocess.run(
        f"psql {db_name} < migrations/0042_add_phone_column.sql",
        shell=True, check=True
    )
    # Roll back
    subprocess.run(
        f"psql {db_name} < migrations/0042_rollback.sql",
        shell=True, check=True
    )
    # Verify column is removed
    conn = psycopg2.connect(dbname=db_name)
    cursor = conn.cursor()
    cursor.execute("""
        SELECT column_name FROM information_schema.columns
        WHERE table_name = 'users' AND column_name = 'phone'
    """)
    assert cursor.fetchone() is None, "Phone column should not exist after rollback"


def test_not_null_column_has_default(migration_db):
    conn, db_name = migration_db
    cursor = conn.cursor()
    # Insert data before migration
    cursor.execute("INSERT INTO users (name, email) VALUES ('Test', 'test@test.com')")
    conn.commit()
    # Apply migration that adds a NOT NULL column
    subprocess.run(
        f"psql {db_name} < migrations/0043_add_status_column.sql",
        shell=True, check=True
    )
    # Verify existing rows received the default value
    conn = psycopg2.connect(dbname=db_name)
    cursor = conn.cursor()
    cursor.execute("SELECT status FROM users WHERE email = 'test@test.com'")
    status = cursor.fetchone()[0]
    assert status is not None, "Existing rows should have default status value"
    assert status == "active", "Default status should be 'active'"
```
## Performance Considerations

### Index Creation

Adding an index to a table with millions of rows can lock the table for minutes:

```sql
-- BAD: blocks writes to the table for the duration of the index build
CREATE INDEX idx_users_email ON users (email);

-- GOOD: builds the index without blocking writes (PostgreSQL).
-- Note: CONCURRENTLY cannot run inside a transaction block, and a
-- failed build leaves behind an INVALID index that must be dropped.
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);
```
### Testing Migration Performance

```python
import subprocess
import time


def test_migration_completes_in_time(migration_db):
    """Migration should complete within acceptable time."""
    conn, db_name = migration_db
    # Load realistic data volume
    cursor = conn.cursor()
    cursor.execute("""
        INSERT INTO users (name, email)
        SELECT 'User ' || i, 'user' || i || '@test.com'
        FROM generate_series(1, 100000) AS i
    """)
    conn.commit()
    start = time.time()
    subprocess.run(
        f"psql {db_name} < migrations/0042_add_phone_column.sql",
        shell=True, check=True
    )
    duration = time.time() - start
    assert duration < 30, f"Migration took {duration:.1f}s (limit: 30s)"
```
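The same load-then-time pattern runs anywhere Python does. A self-contained sketch with SQLite standing in for the real database (the table, row count, and threshold are illustrative; tune them to your actual data volume and hardware):

```python
import sqlite3
import time

# Load a realistic row count, then time the schema change
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    ((f"user{i}@test.com",) for i in range(100_000)),
)
conn.commit()

start = time.time()
conn.execute("CREATE INDEX idx_users_email ON users (email)")
duration = time.time() - start

# Generous threshold for a sketch; the point is that the limit is asserted
assert duration < 30, f"Index creation took {duration:.1f}s (limit: 30s)"
print(f"Indexed 100,000 rows in {duration:.3f}s")
```

A timing assertion like this turns a slow migration from a production surprise into a failing test.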
## Migration Testing in CI

```yaml
# GitHub Actions migration testing
migration-test:
  runs-on: ubuntu-latest
  services:
    postgres:
      image: postgres:15
      env:
        POSTGRES_PASSWORD: testpass
      ports:
        - 5432:5432
      # Wait until the server accepts connections before running steps
      options: >-
        --health-cmd pg_isready
        --health-interval 10s
        --health-timeout 5s
        --health-retries 5
  env:
    PGPASSWORD: testpass  # Lets psql authenticate non-interactively
  steps:
    - uses: actions/checkout@v4
    - name: Load production schema
      run: psql -h localhost -U postgres -f schema_dump.sql
    - name: Run migration
      run: psql -h localhost -U postgres -f migrations/latest.sql
    - name: Verify migration
      run: pytest tests/migrations/ -v
    - name: Test rollback
      run: psql -h localhost -U postgres -f migrations/latest_rollback.sql
```
## Practical Exercise
- Take a migration file (or write one that adds a new column)
- Write a script that tests: applies cleanly, preserves data, rollback works
- Test what happens when the migration adds a NOT NULL column without a default to a table with existing data
- Measure migration time with 100,000 rows and set a performance threshold
- Add migration testing to a CI pipeline
## Key Takeaways
- Migrations are high-risk operations — always test on a production schema copy
- Test the full lifecycle: apply, verify, rollback, verify again
- Verify data preservation: known data before migration must exist after
- NOT NULL columns need defaults — test with existing data
- Performance testing prevents long table locks in production
- Automate migration testing in CI to catch issues before deployment