0009 — TDD and stability as primary; the RLS isolation test is the keystone gate
- Status: accepted
- Date: 2026-05-06
- Deciders: Derek
Context
Ark is built by a one-engineer team whose day job is C++ low-level graphics — a discipline-heavy, correctness-paramount, deeply tested practice. The pitch (“one engineer + AI agents at N tenants”) collapses if anything breaks unpredictably.
Combined with multi-tenancy via row-partitioning (ADR 0002), correctness becomes existential: a single missing or wrong RLS policy leaks data across organizations. The mitigation isn’t “be careful” — careful doesn’t scale.
Decision
Test-driven development is the default, not optional. Every package follows: failing test first, implementation second, passing test third. New behavior without a test is a review-blocker.
The RLS isolation test (packages/db/test/rls-isolation.spec.ts) is the keystone gate. It boots a real Postgres (Supabase local), applies all migrations, creates two organizations with members, exercises every CRUD path and every published-content read path, and asserts that no query in either direction can see data from the other org.
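The assertion half of such a test can be sketched as a pure function. This is a minimal illustration, not the contents of rls-isolation.spec.ts: the names `Observation`, `Leak`, and `findLeaks` are hypothetical, and the sketch assumes every tenant-scoped row carries an org identifier that the harness can read back. The harness itself (booting Postgres, applying migrations, querying as each member) is deliberately left out.

```typescript
// Hypothetical shapes for illustration; not taken from rls-isolation.spec.ts.
type OrgId = string;

// What one authenticated query returned: who asked, which table was read,
// and the owning org of every row that came back from the real database.
interface Observation {
  table: string;
  viewer: OrgId;
  rowOrgIds: OrgId[];
}

interface Leak {
  table: string;
  viewer: OrgId;
  foreignRowCount: number;
}

// A leak is any observation in which the viewer received rows owned by
// another org. An empty result means isolation held for every observation.
function findLeaks(observations: readonly Observation[]): Leak[] {
  const leaks: Leak[] = [];
  for (const obs of observations) {
    const foreignRowCount = obs.rowOrgIds.filter((id) => id !== obs.viewer).length;
    if (foreignRowCount > 0) {
      leaks.push({ table: obs.table, viewer: obs.viewer, foreignRowCount });
    }
  }
  return leaks;
}
```

Under these assumptions, the spec would collect one observation per table and per direction (org A reading, org B reading) from the real database and fail if `findLeaks` returns anything. Reporting the table and viewer, rather than a bare boolean, points a red gate at the policy that needs fixing.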
CI runs the RLS isolation test on every push. A red gate blocks merge. There is no merge-with-failing-tests escape valve.
New tables and policies extend the RLS test as part of their PR. A migration that adds a table without a corresponding addition to the RLS test is rejected. The migration linter (pnpm migrate:lint) checks structure; the test verifies behavior.
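To make the structural/behavioral split concrete, here is a sketch of one structural rule a migration linter could enforce. The rule itself is an assumption (this ADR does not list what pnpm migrate:lint checks), and `collectNames` and `tablesMissingRls` are hypothetical helpers. The check is regex-based and naive: it does not parse SQL, so commented-out statements would be matched too.

```typescript
// Assumed rule, for illustration only: every table created in a migration
// must also have row level security enabled in that same migration.

// Collect the first capture group of every match, with naive normalization
// (quotes stripped, lowercased) so "Notes" and notes compare equal.
function collectNames(sql: string, pattern: RegExp): string[] {
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sql)) !== null) {
    names.push(match[1].replace(/"/g, "").toLowerCase());
  }
  return names;
}

// Return the tables that are created but never get
// ALTER TABLE ... ENABLE ROW LEVEL SECURITY in the given SQL text.
function tablesMissingRls(sql: string): string[] {
  const created = collectNames(
    sql,
    /create\s+table\s+(?:if\s+not\s+exists\s+)?([\w."]+)/gi,
  );
  const secured = collectNames(
    sql,
    /alter\s+table\s+(?:if\s+exists\s+)?(?:only\s+)?([\w."]+)\s+enable\s+row\s+level\s+security/gi,
  );
  return created.filter((name) => secured.indexOf(name) === -1);
}
```

A check like this can only say that RLS was switched on. It cannot say whether the policies are correct, which is why the behavioral test against a real database remains the gate.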
Consequences
Easier:
- A single engineer + AI agents can confidently refactor — the tests are the safety net
- Multi-tenancy correctness is observable and continuous, not a hopeful claim
- Onboarding a new feature follows a predictable path
- The test suite is the spec; documentation drift is bounded
Harder:
- Velocity is lower: you can’t merge an idea, only a verified idea
- Setting up the RLS test harness (real Postgres in CI) is an upfront cost, paid once and used forever
- Mocked tests are not allowed for the data layer (we know from internalize what mocked-DB tests cost)
Trip-wires
We reconsider this stance only if:
- A test-first practice provably blocks an urgent fix that has no other path forward (write the post-mortem; the answer is probably “we didn’t have the right test infrastructure,” not “TDD is wrong”)
- The RLS isolation test fails in CI more than twice in a quarter (this signals our policy-writing process is too error-prone — see ADR 0002 trip-wires)
Alternatives considered
- Test-after development with a test target. Looser; matches what most teams do; doesn’t give the same confidence at our scale of ambition (multi-tenant on shared infra). Not enough.
- Property-based testing over example-based. Genuinely better in some places (RLS isolation could use it). We adopt it where it fits, but the default is example-based for clarity.
- Mocked DB tests for speed. Rejected. Internalize already documented what mocked-Supabase tests cost. The DB is shared infrastructure; we test against the real thing.