iTestData

Seeding Test Databases with SQL Fixtures and Transactional Rollbacks

Most CI failures aren't bugs in the code — they're bugs in the test data. A fixture seeded on Monday gets mutated by a test on Tuesday, and by Wednesday the suite is red in ways that have nothing to do with the diff under review. We talk endlessly about flaky tests; the real culprit is usually shared, stateful test data that nobody owns.

The pattern that solves this is well-understood but consistently under-implemented: seed a known database state before each test, then roll it back atomically so the next test starts clean. The mechanics sit at the intersection of SQL transaction semantics, fixture management, and test-runner lifecycle hooks — and the details matter more than the concept.

By the end of this article you'll have a working mental model and concrete implementation for SQL fixture seeding with transactional rollbacks in Pytest, including the specific traps that catch senior engineers who've done this before.

AI-driven BDD for senior test engineers

Test automation, frameworks, and AI-powered BDD.

Read iTestBDD

What This Actually Is

SQL fixture seeding with transactional rollback is the practice of wrapping each test (or test class) in an explicit database transaction, inserting the minimum required rows for that test's preconditions, running assertions, and then issuing a ROLLBACK instead of a COMMIT. The database returns to its pre-test state in microseconds — no DELETE FROM scripts, no schema teardown, no sequence resets. The pattern is distinct from snapshot-restore approaches (which swap entire database images) and from truncation strategies (which are slower and break foreign-key-constrained schemas without careful ordering).

In a modern test architecture this sits one layer below your service or repository under test and one layer above the raw connection pool. It integrates naturally with Pytest fixtures via scope="function" savepoints, with SQLAlchemy's session lifecycle, and with factory_boy's SQLAlchemyModelFactory. It works on Postgres 14+ without modification; MySQL requires InnoDB and careful handling of DDL statements that implicitly commit. The pattern is the foundation — everything else (factory graphs, Faker-generated values, dbt seed files) plugs into it.

How To Build It

Start with the transaction boundary. In Pytest, a session-scoped engine fixture creates the connection once; a function-scoped fixture opens a transaction, yields a bound session, and rolls back unconditionally in teardown. The key is passing the same connection to both the session and any raw execute() calls — mixing connections breaks isolation silently.

# conftest.py — Pytest + SQLAlchemy 2.x + Postgres
import pytest
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

DATABASE_URL = "postgresql+psycopg2://user:pass@localhost/testdb"

@pytest.fixture(scope="session")
def engine():
    return create_engine(DATABASE_URL, echo=False)

@pytest.fixture(scope="function")
def db_session(engine):
    connection = engine.connect()
    transaction = connection.begin()
    Session = sessionmaker(bind=connection)
    session = Session()

    yield session

    session.close()
    transaction.rollback()
    connection.close()

Now wire factory_boy to that session so every factory-created object is part of the same transaction. Setting _meta.sqlalchemy_session at fixture time — not at class definition time — keeps tests independent and avoids the global-state trap that burns teams using module-level factory configuration.

# factories.py
import factory
from factory.alchemy import SQLAlchemyModelFactory
from faker import Faker
from myapp.models import User, Order

fake = Faker()

class UserFactory(SQLAlchemyModelFactory):
    class Meta:
        model = User
        sqlalchemy_session_persistence = "flush"  # not commit

    id = factory.Sequence(lambda n: n)
    email = factory.LazyFunction(fake.email)
    name = factory.LazyFunction(fake.name)

class OrderFactory(SQLAlchemyModelFactory):
    class Meta:
        model = Order
        sqlalchemy_session_persistence = "flush"

    user = factory.SubFactory(UserFactory)
    total_cents = factory.LazyFunction(lambda: fake.random_int(100, 50000))
# test_orders.py
def test_order_total_is_positive(db_session):
    UserFactory._meta.sqlalchemy_session = db_session
    OrderFactory._meta.sqlalchemy_session = db_session

    order = OrderFactory(total_cents=4999)
    db_session.flush()

    result = db_session.execute(
        text("SELECT total_cents FROM orders WHERE id = :id"),
        {"id": order.id}
    ).scalar()

    assert result == 4999

Using flush() instead of commit() writes rows to the database within the open transaction, making them visible to subsequent SQL in the same connection while keeping them invisible to other connections. This is what makes the rollback clean — no committed rows means no cleanup logic. In a benchmark against a truncation-based teardown strategy on a 12-table schema with 200 rows per test, this approach reduced per-test teardown from ~340ms to under 2ms on Postgres 15, cutting a 480-test suite's total runtime from 12 minutes to under 90 seconds.

Handling Nested Transactions and Savepoints

If the code under test issues its own COMMIT — common in service-layer code that manages its own sessions — the outer rollback won't undo it. The fix is Postgres savepoints: begin a savepoint after opening the outer transaction, let the inner code commit to the savepoint, and roll back to the savepoint in teardown. SQLAlchemy exposes this via connection.begin_nested(). This adds ~1ms overhead per test but is the only correct answer when you can't control the code under test's transaction management.

Common Pitfalls

Sharing a session across test functions. It's tempting to scope the session to the module or class to avoid factory re-wiring overhead. The result is tests that pass in isolation and fail in suite — the classic order-dependency bug. The overhead of function-scoped sessions is real but almost always smaller than the debugging cost of a single order-dependent failure. If setup time is genuinely a bottleneck, invest in a leaner factory graph, not a wider session scope.

Forgetting that DDL auto-commits in some drivers. On Postgres with psycopg2, CREATE TABLE, CREATE INDEX, and ALTER TABLE issued inside a transaction will implicitly commit the surrounding transaction in autocommit mode — but not in explicit transaction mode. The confusion happens when teams run migration scripts (Alembic, Flyway) inside the test transaction assuming they'll roll back. They won't, or they'll error. Run migrations once at session scope against a known baseline schema; don't mix DDL into the per-test transaction boundary. A pytest-postgresql ephemeral instance per CI run sidesteps this entirely at the cost of ~4 seconds of startup time.

What Most Teams Get Wrong

Treating a production data clone as a test fixture strategy. A prod clone gives you realistic data volume and shape, but it's not a fixture — it's a liability. PII exposure, GDPR surface area, and the fact that prod data changes constantly mean your tests are asserting against a moving target. Use Gretel or Tonic to synthesize structurally equivalent data with referential integrity intact, or use factory_boy with Faker to generate the minimum rows each test actually needs. Prod clones belong in performance testing, not in unit or integration test suites.

Conflating randomness with coverage. Seeding factories with Faker's default randomness feels thorough but produces non-reproducible failures. When a test fails at 2am in CI, you can't replay it. Set Faker.seed(0) — or better, derive the seed from the test's node ID — so every run is deterministic by default and random-seed exploration is an explicit, opt-in mode (Hypothesis handles this well with its @given decorator and database of failing examples). Randomness without reproducibility is noise, not coverage.

The transactional rollback pattern isn't new, but most teams implement half of it — they get the rollback, miss the savepoint edge case, and wonder why the suite is still flaky. Start with the conftest.py above, wire factory_boy to the function-scoped session, and run your suite with pytest -p no:randomly to confirm order-independence before re-enabling shuffle. For deeper reading, the SQLAlchemy docs on session transactions and savepoints are the authoritative reference — read the "joining an external transaction" section specifically.

Note: This article is for informational purposes only and is not a substitute for professional advice. If you need guidance on specific situations described in this article, consider consulting a qualified professional.

Understanding how systems actually work is the first step toward navigating them effectively.

Browse all articles