ODI Reference Architecture for Agentic Analytics

Agentic analytics needs a reference architecture because "connect the model to the warehouse" is not an architecture.

The architecture starts below the agent

Agentic analytics has at least seven layers: open tables, catalogs, governance, lineage, semantic context, query or retrieval APIs, and operational review. The model is important, but it is not the layer that owns data truth.

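Open tables give data a portable storage and metadata boundary. Catalogs coordinate table discovery, identity, ownership, and operations. Governance defines which policies apply and how they are enforced. Lineage explains where data came from. Semantic context layers package meaning for agents. Query and retrieval APIs expose controlled, bounded operations. Operational review closes the loop by checking what the agent did against evidence from the layers beneath it.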

Contracts connect the layers

The weak version of this architecture draws boxes and arrows. The useful version defines contracts between layers. Table contracts define schema and snapshots. Catalog contracts define ownership and access. Tool contracts define inputs and outputs. Evaluation contracts define expected answer behavior.

Those contracts should be testable. An agentic analytics stack that cannot test access, freshness, lineage, tool output, and recovery behavior is not ready for production decisions.
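
To make "testable" concrete, here is a minimal sketch of a table contract check covering schema and freshness. The names TableContract, TableState, and check_table are hypothetical and do not come from any particular catalog or table format; a real stack would read the observed state from its own catalog.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TableContract:
    """Hypothetical table contract: expected columns plus a freshness bound."""
    name: str
    columns: dict[str, str]      # column name -> expected type name
    max_staleness: timedelta


@dataclass(frozen=True)
class TableState:
    """Hypothetical observed state, as a catalog might report it."""
    columns: dict[str, str]
    snapshot_id: int
    committed_at: datetime


def check_table(contract: TableContract, state: TableState, now: datetime) -> list[str]:
    """Return contract violations; an empty list means the table passes."""
    violations = []
    # Schema check: every contracted column must exist with the expected type.
    # Extra columns in the table are allowed.
    for column, expected in contract.columns.items():
        actual = state.columns.get(column)
        if actual is None:
            violations.append(f"{contract.name}: missing column {column}")
        elif actual != expected:
            violations.append(f"{contract.name}: {column} is {actual}, expected {expected}")
    # Freshness check: the latest snapshot must be recent enough.
    if now - state.committed_at > contract.max_staleness:
        violations.append(f"{contract.name}: snapshot {state.snapshot_id} is stale")
    return violations
```

A check like this can run in CI and again before an agent tool is allowed to serve answers from the table. Access, lineage, tool output, and recovery checks would follow the same shape: a declared expectation, an observed state, and a list of violations.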

Core idea: agentic analytics is governed infrastructure with a model interface, not a model feature with a data connection.

The ODI pattern keeps agents from owning the data layer

Open Data Infrastructure keeps control with data owners while still giving agents useful interfaces. That is the balance. Agents need rich context, but they should not become the only place where policy, semantics, and lineage are understood.
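
One way to picture this split is a query tool that asks a governance-owned policy store before it runs anything. The sketch below is illustrative only: QueryRequest, POLICY, authorize, and run_query_tool are hypothetical names, and the in-memory dictionary stands in for whatever policy service a real deployment uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QueryRequest:
    """Hypothetical tool input: who is asking, and for which columns."""
    principal: str
    table: str
    columns: tuple[str, ...]


class PolicyDenied(Exception):
    """Raised when the governance layer rejects a request."""


# Hypothetical policy store owned by the governance layer, not by the agent:
# (principal, table) -> columns that principal may read.
POLICY = {("analyst-agent", "orders"): {"order_id", "amount", "region"}}


def authorize(req: QueryRequest) -> None:
    """Raise PolicyDenied if any requested column is not allowed."""
    allowed = POLICY.get((req.principal, req.table), set())
    blocked = [c for c in req.columns if c not in allowed]
    if blocked:
        raise PolicyDenied(f"{req.principal} may not read {blocked} from {req.table}")


def run_query_tool(
    req: QueryRequest,
    execute: Callable[[QueryRequest], list[dict]],
) -> list[dict]:
    """Tool entry point: the policy check always runs before execution."""
    authorize(req)
    return execute(req)
```

The agent only ever sees run_query_tool. Swapping the model or rewriting the prompt does not change what the policy store allows, which is the point of keeping policy below the agent.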

For the supporting pieces, read the companion articles on the ODI reference architecture, MCP and ODI, and tool schemas as data contracts. Agentic analytics is where those patterns become one system.

What breaks first

Agentic analytics usually fails at the boundary between a clever interface and an ungoverned data path.

The agent can answer questions but cannot cite table snapshots or lineage.
Tool schemas validate fields but not policy assumptions.
Retrieval context is fresh enough for demos and stale enough for incidents.
Operational review sees model traces but not catalog, query, and data product evidence.
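
The first and last failures above come down to missing evidence. A rough sketch of an answer record that carries that evidence, together with a check that flags what a reviewer would be missing, might look like this. AnswerRecord, TableEvidence, and review_gaps are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableEvidence:
    """Hypothetical evidence for one table an answer relied on."""
    table: str
    snapshot_id: int
    upstream: tuple[str, ...]    # lineage: the sources this table was built from


@dataclass(frozen=True)
class AnswerRecord:
    """Hypothetical record stored alongside every agent answer."""
    question: str
    answer: str
    query_id: Optional[str] = None
    tables: tuple[TableEvidence, ...] = ()


def review_gaps(record: AnswerRecord) -> list[str]:
    """List the evidence a reviewer would be missing for this answer."""
    gaps = []
    if record.query_id is None:
        gaps.append("no query id recorded")
    if not record.tables:
        gaps.append("no table snapshots cited")
    for evidence in record.tables:
        if not evidence.upstream:
            gaps.append(f"no lineage recorded for {evidence.table}")
    return gaps
```

An answer whose review_gaps list is non-empty can still be shown to a user, but it should not be treated as reviewable evidence for a production decision.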

Questions to ask during architecture review

Ask which layer owns table truth, which layer owns policy, which layer owns semantic context, which layer exposes tools, and which layer records review evidence. Ask what changes when you swap the model, query engine, catalog, or retrieval index.

A reference architecture is doing its job when a component can change without destroying trust in the data.
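
A swap test is one hedged way to check this property. The sketch below assumes a hypothetical QueryEngine interface and an in-memory stand-in engine; it runs the same metrics against the same pinned snapshots on the old and new engine and reports any differences.

```python
from typing import Protocol


class QueryEngine(Protocol):
    """Hypothetical interface: compute a named metric at a pinned snapshot."""

    def metric(self, name: str, snapshot_id: int) -> float: ...


class InMemoryEngine:
    """Stand-in engine backed by a dict; a real engine would run a query."""

    def __init__(self, results: dict[tuple[str, int], float]) -> None:
        self._results = results

    def metric(self, name: str, snapshot_id: int) -> float:
        return self._results[(name, snapshot_id)]


def swap_regressions(
    old: QueryEngine,
    new: QueryEngine,
    cases: list[tuple[str, int]],
    tolerance: float = 1e-9,
) -> list[str]:
    """Compare both engines on the same (metric, snapshot) cases."""
    regressions = []
    for name, snapshot_id in cases:
        before = old.metric(name, snapshot_id)
        after = new.metric(name, snapshot_id)
        if abs(before - after) > tolerance:
            regressions.append(f"{name}@{snapshot_id}: {before} -> {after}")
    return regressions
```

Pinning the snapshot is what makes the comparison meaningful: any difference is then attributable to the swapped component rather than to new data arriving between runs. The same pattern could apply to a swapped model, catalog, or retrieval index, each with its own evaluation cases.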

Agentic analytics works when the agent is powerful and the data layer still owns the truth.
