Databricks (DBRX) · published Jun 12, 2026

Enabling Evolutionary Database Development: Database branching with Lakebase, the conclusion

Databricks Blog

The methodology described in Evolutionary Database Design and operationalized in Refactoring Databases: Evolutionary Database Design has been clear for twenty years. The seven practices, the catalog of 70+ named refactorings, the transition mechanics – all of it documented, peer-reviewed, taught.

That methodology reached CI/CD in 2010 with Continuous Delivery (Chapter 12: Managing Data). Migrations became first-class artifacts in the deployment pipeline. The discipline of database-changes-as-code reached the broader CI/CD movement. What CD didn't solve was per-pipeline isolation: pipelines could run migrations, but they still needed a target database, and that target was shared.

Practice #4 – Everybody gets their own database instance – has stayed aspirational on most teams because true per-developer production-shaped databases cost time, money, and DBA cycles. The compensating layer that emerged to work around the gap (mock objects, shared staging environments, in-memory database substitutes, DBA ticket queues) became foundational methodology by default, not by design.

In 2026, copy-on-write database branching arrives in Databricks Lakebase. A one-second, zero-storage-at-creation branch of a terabyte-scale production database is now an O(1) operation. The constraint that kept Practice #4 aspirational has lifted. This series describes what changes when the constraint lifts: not the methodology – that holds – but the practices that emerge for the first time, the team-scale governance that becomes automatic, the role evolution for the DBA, and the new substrate that agents share with their human counterparts.

Meet Jen

Jen is the developer character from Evolutionary Database Design.
In that essay she implemented a database refactoring – splitting an inventory_code field into location_code, batch_number, and serial_number – as a routine user story, illustrating that DBAs and developers can collaborate, schemas can evolve in small increments, and migrations carry the change forward safely.

The series picks up with Jen twenty years later. The methodology she follows is the same one she followed in 2003. What's new is the substrate underneath her workflow: copy-on-write database branching, which makes the practices she has been reading about operationally real at production scale. Across the three parts of this series she is the same Jen at three scopes – her day (Part 1), her new playbook (Part 2), and her team (Part 3).

Part 3: Jen's Team at Scale

Part 1 walked Jen through one feature. Part 2 named the eleven-practice playbook her work follows in 2026. Part 3 takes the same playbook to a team of fifty developers, with agents creating branches alongside humans, and asks: what becomes structural at this scale?

Three things become load-bearing.

First, the tier topology, the long-running branches that represent each environment in the promotion path. At one developer, you had a feature branch and production. At fifty, you have a structured hierarchy with stable lanes and ephemeral lanes layered on top.

Second, the permission model, the framework that says who can do what to which branch. At one developer, you could trust convention. At fifty, with agents in the mix, the framework has to be designed once and enforced automatically.

Third, the role of the DBA. At one developer, the DBA was Jen's design partner on the PR. At fifty, the DBA is the platform engineer who designed the framework Jen and her teammates are operating inside.

This post covers each of those, then turns to the agents. Agents on the same capability is Practice #11.
Agents are like junior developers: they produce code that runs, tests that pass, migrations that apply, and, without guidance, unmaintainable systems. Tests are how the team keeps them honest. The TDD playbook that comes next is how the team makes the tests come first.

Tiers as long-running branches, not separate instances

In the world before branching, an environment was an instance: a dedicated Postgres deployment for staging, another for UAT, another for performance testing, each provisioned, patched, masked, and synced separately. The compensating layer Part 2 named lived here too. Drift between environments was structural.

At team scale, the tier model collapses into long-running branches off the same Lakebase parent. A branch is one of two things: a tier (long-living, a parent in the promotion hierarchy) or a feature (ephemeral, descends from a tier and gets cleaned up). A tier has a parent. The parent-of chain is the promotion hierarchy.
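The tier/feature distinction and the parent-of chain can be pictured with a small data model. The sketch below is plain Python, not a Lakebase API: the Branch class, the promotion_path helper, and the branch names (main, staging, dev, feature/split-inventory-code) are all hypothetical, and it assumes the root tier is the one branch with no parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Branch:
    name: str
    kind: str               # "tier" (long-living) or "feature" (ephemeral)
    parent: Optional[str]   # parent branch name; None only for the root tier (assumption)


def promotion_path(branches: dict[str, Branch], start: str) -> list[str]:
    """Walk the parent-of chain from `start` up to the root tier."""
    path = [start]
    seen = {start}
    current = branches[start]
    while current.parent is not None:
        if current.parent in seen:
            raise ValueError(f"cycle in parent chain at {current.parent!r}")
        parent = branches[current.parent]
        # Every ancestor must be a tier: features descend from tiers, never the reverse.
        if parent.kind != "tier":
            raise ValueError(f"{current.name!r} descends from non-tier {parent.name!r}")
        path.append(parent.name)
        seen.add(parent.name)
        current = parent
    return path


# Hypothetical layout; the names are illustrative only.
BRANCHES = {
    b.name: b
    for b in [
        Branch("main", "tier", None),
        Branch("staging", "tier", "main"),
        Branch("dev", "tier", "staging"),
        Branch("feature/split-inventory-code", "feature", "dev"),
    ]
}

print(promotion_path(BRANCHES, "feature/split-inventory-code"))
# → ['feature/split-inventory-code', 'dev', 'staging', 'main']
```

Storing only the parent pointer keeps the hierarchy in one place: the promotion order is derived by walking the chain rather than being maintained as a separate list that could drift from it.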

Fig 1: A simple layout of Main line and its branches

Fig 1 shows a simple hierarchy: main is production, and feature branches are taken whenever needed. This setup is generally useful for early prototyping or early-stage work with a very small team. Mature teams with more developers, many environments, or both need a setup like the one shown in Fig 2.

Fig 2: A layout with main line consisting of latest schema and reference data and all its various branches

Some enterprises need a release candidate (RC) that stays under development for some time and is promoted to production after successful testing. Fig 3 shows a layout that allows release candidates to be developed and later promoted to production; the release candidate branch can then be cleaned up.

Fig 3: A layout using release candidate for development and testing

The names of the branches are arbitrary; what matters is how the parent-of relationships are set up. A policy that rejects transitions contradicting the parent chain can be implemented to prevent a direct feature merge that bypasses the hierarchy. The policy definitions bring several benefits for pipeline management:

One pipeline definition, branch-aware. The pr.yml introduced in Part 2 runs against every PR; the merge.yml runs against every promotion. The same workflow covers features, tiers, and the transitions between them.

Promotion is merge, not redeploy. Shipping from staging to production is a git merge whose downstream effect is a Lakebase branch promotion. The migration applies once at each tier, validated at the prior tier first, just as code is validated in earlier stages.

No drift between "the test environment" and production. Every tier...
