Open Data Infrastructure for Analytics Engineers

Analytics engineering started as a better way to manage SQL. Open Data Infrastructure (ODI) turns it into a control point for business meaning.

The model is now part of the data contract

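In a warehouse-centric world, analytics engineering often meant clean SQL models, tested transformations, and metrics that matched across dashboards. That work still matters. What ODI changes is the failure mode: a wrong model no longer misleads one dashboard, it misleads every consumer that trusts the contract.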

When the data platform becomes multi-engine and AI-facing, a model is not just a warehouse object. It is part of a contract that other engines, agents, notebooks, dashboards, and operational applications may depend on.

Core idea: analytics engineers are becoming the stewards of portable business meaning, not just maintainers of SQL folders.

Semantic definitions need infrastructure around them

A semantic layer helps define metrics, dimensions, entities, and relationships. That is good. It is not enough by itself. The semantic definition has to connect to the table contract underneath it, the policy around it, and the lineage that explains where the value came from.
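One way to picture that connection is a metric record that cannot be considered complete until it points at its table contract, its policy, and its lineage. The sketch below is illustrative only: `MetricDefinition`, `missing_links`, and every identifier in the example values (`sales.orders@v3`, `finance_read`, `raw.orders`) are assumptions made up for this example, not part of any semantic-layer product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical metric record: the definition plus what it depends on."""
    name: str
    expression: str                  # how the value is computed
    table_contract: str              # identifier of the table contract underneath
    access_policy: str               # identifier of the policy around the metric
    lineage: tuple[str, ...] = ()    # upstream sources that explain the value


def missing_links(metric: MetricDefinition) -> list[str]:
    """Return the names of the infrastructure links that are still empty."""
    gaps = []
    if not metric.table_contract:
        gaps.append("table_contract")
    if not metric.access_policy:
        gaps.append("access_policy")
    if not metric.lineage:
        gaps.append("lineage")
    return gaps


# A metric wired to its contract, policy, and lineage (all values are made up).
gross_revenue = MetricDefinition(
    name="gross_revenue",
    expression="SUM(order_amount)",
    table_contract="sales.orders@v3",
    access_policy="finance_read",
    lineage=("raw.orders",),
)

# The same definition with nothing underneath it.
orphan_metric = MetricDefinition(
    name="gross_revenue",
    expression="SUM(order_amount)",
    table_contract="",
    access_policy="",
)

print(missing_links(gross_revenue))
print(missing_links(orphan_metric))
```

The second metric has the same name and expression as the first, which is the point: the definition alone looks identical, and only the missing links show that it has no infrastructure behind it.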

If gross revenue means one thing in a dashboard and another thing in an agent workflow, the semantic layer has not solved meaning. It has moved the argument to a new YAML file.

Open tables change how analytics models age

Open table formats make models more portable, but only when analytics engineers respect the boundary between what a model means and how one engine happens to execute it. A model that depends on one warehouse-specific function, one private permission model, and one hidden scheduler behavior is not portable just because the output lands in Parquet.

The practical move is to separate durable model meaning from engine-specific execution. Use the engine features you need. Document the dependency when you do. Portability does not mean pretending engines are identical. It means knowing which assumptions are part of the contract.
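A minimal sketch of that separation, assuming a model description that keeps its durable meaning in one place and lists engine-specific features per engine. `ModelSpec`, `undocumented_features`, the engine names `warehouse_x` and `engine_y`, and the feature names are all hypothetical placeholders, not real products or functions.

```python
from dataclasses import dataclass, field


@dataclass
class ModelSpec:
    """Hypothetical model description that keeps meaning apart from execution."""
    name: str
    meaning: str                       # durable, engine-independent description
    output_columns: dict[str, str]     # column name -> logical type
    # engine name -> engine-specific features the model knowingly relies on
    engine_dependencies: dict[str, list[str]] = field(default_factory=dict)


def undocumented_features(spec: ModelSpec, engine: str, features_used: list[str]) -> list[str]:
    """Return features used on an engine that the spec does not declare."""
    documented = set(spec.engine_dependencies.get(engine, []))
    return sorted(set(features_used) - documented)


daily_orders = ModelSpec(
    name="daily_orders",
    meaning="One row per order date with the count of completed orders.",
    output_columns={"order_date": "date", "completed_orders": "bigint"},
    engine_dependencies={"warehouse_x": ["approx_count_distinct"]},
)

# One declared feature and one hidden assumption on the same engine.
print(undocumented_features(
    daily_orders, "warehouse_x", ["approx_count_distinct", "qualify_clause"]
))
```

Using an engine feature is not the problem in this sketch. The check only flags the features nobody wrote down, which are the assumptions that silently become part of the contract.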

Change management is the real skill

Analytics engineering lives in change. Columns get renamed. Metrics get redefined. Late data arrives. Backfills happen. Stakeholder requests become production dependencies before anyone admits it.

Tools like dbt Core and SQLMesh matter because they give teams structure for tests, documentation, environments, and promotion. ODI raises the bar: change management has to account for every consumer of the contract, not only the dashboard that triggered the work.
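To make "every consumer of the contract" concrete, here is a rough sketch that compares two versions of a model's columns and lists which consumers touch the changed ones. It is not how dbt Core or SQLMesh work internally; the functions `changed_columns` and `affected_consumers`, the consumer names, and the column names are assumptions invented for the example.

```python
def changed_columns(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Map each removed or retyped column to a short reason."""
    changes = {}
    for column, old_type in old.items():
        if column not in new:
            changes[column] = "removed"
        elif new[column] != old_type:
            changes[column] = f"type changed from {old_type} to {new[column]}"
    return changes


def affected_consumers(
    changes: dict[str, str], consumers: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Return each consumer that reads at least one changed column."""
    impact = {}
    for consumer, columns_read in consumers.items():
        hit = sorted(columns_read & changes.keys())
        if hit:
            impact[consumer] = hit
    return impact


# A rename looks like a removal to anything that reads the old name.
old_schema = {"order_id": "bigint", "amount": "decimal", "region": "string"}
new_schema = {"order_id": "bigint", "amount_usd": "decimal", "region": "string"}

# Hypothetical consumers and the columns each one reads.
consumers = {
    "revenue_dashboard": {"order_id", "amount"},
    "pricing_agent": {"amount", "region"},
    "regional_notebook": {"region"},
}

changes = changed_columns(old_schema, new_schema)
print(changes)
print(affected_consumers(changes, consumers))
```

The dashboard that requested the rename appears in the result, and so does the agent nobody mentioned in the ticket. Knowing which columns each consumer reads is the hard part in practice; the sketch simply assumes that mapping exists.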

A portable analytics checklist

Before publishing a model into an open lakehouse, ask:

Can another approved engine read the output table without losing meaning?
Are metric definitions versioned and connected to source lineage?
Are access rules enforced below the dashboard layer?
Can late-arriving data and backfills be explained after the fact?
Can an AI system retrieve the value with source, freshness, and policy context?

That is analytics engineering after the warehouse stops being the only center of gravity.
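The checklist above can also be treated as a publish gate. The sketch below assumes the answers are collected by hand or by other tooling; the key names in `CHECKLIST` and the function `unmet_checks` are invented labels for the five questions, not a standard.

```python
# One label per checklist question, in the order the questions appear.
CHECKLIST = (
    "readable_by_other_engine",
    "metrics_versioned_with_lineage",
    "access_enforced_below_dashboard",
    "late_data_and_backfills_explainable",
    "ai_retrievable_with_context",
)


def unmet_checks(answers: dict[str, bool]) -> list[str]:
    """Return checklist items answered 'no' or not answered at all."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


# Hypothetical answers for a model that is about to be published.
answers = {
    "readable_by_other_engine": True,
    "metrics_versioned_with_lineage": True,
    "access_enforced_below_dashboard": False,
}

blocking = unmet_checks(answers)
print(blocking)
```

Treating an unanswered question the same as a "no" is a deliberate choice in this sketch: a model should not pass the gate because nobody asked.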

Sources to start with

Start with transformation and semantic-layer docs, then connect them to open table and catalog contracts.

