Open Data Infrastructure
dbt and SQLMesh on the Open Lakehouse
Transformation frameworks do not disappear in an open lakehouse. They just stop being a warehouse feature and become part of the contract layer.
The open lakehouse pitch is simple: put data in open tables and keep engines interchangeable. The part teams forget is that transformation workflows are where lock-in is usually enforced.
The shift: warehouse workflows to open tables
In a warehouse-centric stack, transformation is often coupled to the warehouse itself: table materializations, incremental strategies, quality checks, and lineage all assume one execution environment.
In an open lakehouse, the tables are the contract. That changes the question from "which warehouse feature do we use?" to "which contract do we guarantee across engines, catalogs, and compute choices?"
Core idea: open tables are not a strategy if your transformation layer is still a single-engine dependency.
What you need to make transformation portable
If you want dbt or SQLMesh to stay portable, you need a few infrastructure guarantees that are easy to gloss over.
- A stable catalog surface: the transformation tool needs a durable way to resolve tables, schemas, and environments.
- A table contract that survives engine change: schema evolution, partition evolution, and snapshot semantics must be readable across engines.
- Operational lineage and audit: these serve automated decisions and incident response, not only dashboards.
- Release control: staging and promotion mechanisms that do not depend on one vendor's job system.
This is why ODI keeps returning to the same boring components: open table formats, open catalogs, and enforceable governance at the boundary.
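To make "a table contract that survives engine change" concrete, here is a minimal sketch of a schema compatibility check that runs outside any engine. The `Column` type, the `breaking_changes` function, and the policy itself are assumptions for illustration. The policy is deliberately strict: real table formats permit some type widening that this sketch would reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str              # engine-neutral type name, e.g. "long" or "string"
    nullable: bool = True


def breaking_changes(old: list[Column], new: list[Column]) -> list[str]:
    """Return reasons why schema `new` would break readers or writers of `old`.

    Assumed policy (hypothetical, stricter than most table formats):
    dropping a column, changing its type, or adding a required column
    counts as a breaking change. Adding a nullable column does not.
    """
    new_by_name = {c.name: c for c in new}
    old_names = {c.name for c in old}
    problems: list[str] = []

    for col in old:
        candidate = new_by_name.get(col.name)
        if candidate is None:
            problems.append(f"dropped column: {col.name}")
        elif candidate.dtype != col.dtype:
            problems.append(
                f"type change on {col.name}: {col.dtype} -> {candidate.dtype}"
            )

    for col in new:
        if col.name not in old_names and not col.nullable:
            problems.append(f"new required column: {col.name}")

    return problems


v1 = [Column("order_id", "long", nullable=False), Column("amount", "decimal(18,2)")]
v2 = v1 + [Column("currency", "string")]            # additive, nullable
v3 = [Column("order_id", "string", nullable=False)]  # retyped, and amount removed

additive_result = breaking_changes(v1, v2)
breaking_result = breaking_changes(v1, v3)
print(additive_result)
print(breaking_result)
```

The useful property is that the check reads only schema metadata, so it can run in CI before any engine executes the transformation.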
How dbt fits
dbt is an opinionated workflow for building transformation models, with documentation and tests layered on top of them. In an open lakehouse, dbt can still be useful, but the assumptions matter.
- dbt is strongest when your execution environment is consistent and your materialization behavior is predictable.
- dbt adds value through project structure, tests, and documentation, even when the target tables live in an open format.
- dbt becomes a portability risk when macros and adapter-specific behavior become the real logic of the system.
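One way to keep that last risk visible is a small lint over model files. The sketch below is an assumption-heavy illustration, not a dbt feature: `ENGINE_SPECIFIC` and `portability_findings` are hypothetical names, and the pattern list is a short sample (dbt's `target.type` and `adapter.dispatch`, BigQuery's `SAFE_CAST`, Snowflake's `LATERAL FLATTEN`) that you would extend for the engines you actually run.

```python
import re

# Illustrative patterns only. Each one marks a place where model logic
# depends on a specific engine or on adapter-level branching.
ENGINE_SPECIFIC = {
    "dbt target branching": re.compile(r"\btarget\.type\b"),
    "dbt adapter dispatch": re.compile(r"\badapter\.dispatch\b"),
    "BigQuery SAFE_CAST": re.compile(r"\bSAFE_CAST\s*\(", re.IGNORECASE),
    "Snowflake LATERAL FLATTEN": re.compile(
        r"\bLATERAL\s+FLATTEN\s*\(", re.IGNORECASE
    ),
}


def portability_findings(model_sql: str) -> list[str]:
    """Return labels of engine-specific constructs found in one model's text."""
    return [
        label
        for label, pattern in ENGINE_SPECIFIC.items()
        if pattern.search(model_sql)
    ]


example_model = """
select
    order_id,
    {% if target.type == 'bigquery' %}
    safe_cast(amount as numeric) as amount
    {% else %}
    cast(amount as decimal(18, 2)) as amount
    {% endif %}
from {{ ref('stg_orders') }}
"""

findings = portability_findings(example_model)
print(findings)
```

A text scan like this will miss plenty and will flag some legitimate uses. Its value is in making the count of engine-specific branches a number the team watches over time.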
How SQLMesh fits
SQLMesh approaches the problem as a versioned, environment-aware transformation system. The core ODI question is whether that versioning and environment model stays portable when your compute changes.
SQLMesh is especially relevant when you care about controlled publishes, backfills that do not break consumers, and explicit environments. Those are release engineering concerns, and open data infrastructure needs that discipline if it wants portability without chaos.
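The release engineering idea can be sketched without any tool: treat each environment as a set of pointers to versioned physical tables, and treat promotion as repointing rather than rebuilding. The toy model below is loosely inspired by that idea and is not SQLMesh's API. `Environments`, `plan`, `promote`, and the table naming scheme are all hypothetical.

```python
import hashlib


def fingerprint(model_sql: str) -> str:
    """Short content hash used as the version of a model definition."""
    return hashlib.sha256(model_sql.encode("utf-8")).hexdigest()[:8]


class Environments:
    """Toy registry: each environment maps a model name to a physical table."""

    def __init__(self) -> None:
        self.pointers: dict[str, dict[str, str]] = {}

    def plan(self, env: str, model: str, model_sql: str) -> str:
        """Name a versioned table for this definition and point `env` at it.

        A real system would also build and validate the table here.
        """
        physical = f"{model}__{fingerprint(model_sql)}"
        self.pointers.setdefault(env, {})[model] = physical
        return physical

    def promote(self, source: str, target: str) -> None:
        """Repoint `target` at the tables already built in `source`."""
        self.pointers[target] = dict(self.pointers[source])


envs = Environments()
envs.plan("prod", "orders", "select id, amount from raw.orders")
envs.plan("dev", "orders", "select id, amount, currency from raw.orders")

prod_before = envs.pointers["prod"]["orders"]
envs.promote("dev", "prod")
prod_after = envs.pointers["prod"]["orders"]
print(prod_before, "->", prod_after)
```

Because consumers read through the pointer, a backfill can finish in `dev` before anything in `prod` changes. Since the pointers are plain metadata, nothing in the scheme depends on which engine built the tables.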
Failure modes and tradeoffs
The failure modes are predictable, which is good news. You can design around them.
- Transformation logic leaks into engine-specific SQL: portability becomes theoretical, not real.
- Metadata is partial: tests, lineage, and ownership exist in the tool, but not in the shared catalog surface.
- Release semantics are implicit: teams "publish" by running a job, not by promoting a governed artifact.
If you want an open lakehouse, treat transformation as part of the contract layer. Do not treat it as a set of warehouse jobs with better documentation.
Sources to start with
Start with the tool docs, then anchor behavior to the open table and catalog specs you plan to standardize on.