From Semantic Layer to Context Graph

Agents do not fail because they cannot query data. They fail because they cannot tell which data is allowed, current, trustworthy, and relevant.

Why semantic layers are not enough for agents

Semantic layers were built to help humans. They define metrics, dimensions, and business-friendly names so analysts can answer questions without memorizing physical schemas. That was the right abstraction for BI.

Agents change the problem. Agents need machine-actionable context: ownership, permissions, lineage, freshness, and the relationships between entities across domains. A metric definition alone does not tell an agent whether it is safe to act on the result.

Core idea: a semantic layer makes data understandable. A context graph makes data governable for machines.

What semantic layers solved

Semantic layers typically focus on:

metric definitions and aggregation rules
business-friendly naming and documentation
join paths and dimensional modeling conventions
consistency across dashboards and reports

That is valuable. It reduces dashboard chaos and makes analytics repeatable. But it also tends to assume a bounded audience (analysts) and a bounded set of use cases (dashboards and reports).

What a context graph adds

A context graph extends the semantic idea in three directions:

relationships across domains: entities, identifiers, and mappings that let a system navigate the business graph
governance and policy: machine-readable rules about who can see what and under which conditions
provenance: lineage, freshness, and quality signals attached to the answers an agent produces

Think of a context graph as "semantic layer plus the things that make it safe to automate." Without governance and provenance, automation becomes a liability.
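To make the three directions concrete, here is a minimal sketch of how relationships, policy, and provenance might sit on the same graph node. Every name in it (Entity, Policy, Provenance, ContextGraph, the sample ids and roles) is hypothetical and chosen for illustration; a real context graph would live in a catalog or graph store, not in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Provenance:
    source_system: str      # system of record the entity came from
    upstream: list[str]     # lineage: names of upstream datasets
    refreshed_at: datetime  # freshness signal


@dataclass
class Policy:
    allowed_roles: set[str]  # roles permitted to read this entity


@dataclass
class Entity:
    entity_id: str
    domain: str
    policy: Policy
    provenance: Provenance
    # relation name -> ids of related entities, possibly in other domains
    relations: dict[str, list[str]] = field(default_factory=dict)


class ContextGraph:
    """Hypothetical in-memory graph; illustration only."""

    def __init__(self) -> None:
        self._entities: dict[str, Entity] = {}

    def add(self, entity: Entity) -> None:
        self._entities[entity.entity_id] = entity

    def neighbors(self, entity_id: str, relation: str, role: str) -> list[Entity]:
        """Follow one relation and return only the entities this role may read."""
        source = self._entities[entity_id]
        related_ids = source.relations.get(relation, [])
        related = [self._entities[i] for i in related_ids if i in self._entities]
        return [e for e in related if role in e.policy.allowed_roles]


now = datetime.now(timezone.utc)
graph = ContextGraph()
graph.add(Entity("customer:42", "customers", Policy({"support", "finance"}),
                 Provenance("crm", ["raw.crm_accounts"], now),
                 relations={"placed": ["order:7", "order:8"]}))
graph.add(Entity("order:7", "orders", Policy({"support", "finance"}),
                 Provenance("oms", ["raw.orders"], now)))
graph.add(Entity("order:8", "orders", Policy({"finance"}),
                 Provenance("oms", ["raw.orders"], now)))

visible = graph.neighbors("customer:42", "placed", role="support")
print([e.entity_id for e in visible])  # prints ['order:7']
```

The point of the sketch is that traversal and policy are evaluated together: the support role can navigate from a customer to its orders, but only sees the orders it is allowed to read, and each returned entity still carries its own provenance.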

A practical architecture view

A practical context graph architecture aligned with ODI (Open Data Infrastructure) usually includes:

open storage and tables: the durable data contract (table formats and catalogs)
metadata and lineage: operational events captured from execution, not reconstructed after incidents
entity graph: a graph view that maps identifiers and relationships across systems
policy engine: enforcement in the data path and the agent tool layer
retrieval layer: a governed interface that returns data plus the provenance that makes it trustworthy

This fits naturally with the ODI idea that governance should be infrastructure behavior. For background, see the ODI articles "ODI for AI Agents" and "ODI Foundation for AI."
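As a rough sketch of the retrieval layer in that list, the function below checks policy before reading anything and returns rows together with lineage and a freshness flag. The dictionaries standing in for the catalog, lineage store, and policy engine, and all table and principal names, are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


class AccessDenied(Exception):
    """Raised when the policy check rejects a read."""


@dataclass(frozen=True)
class GovernedAnswer:
    rows: list[dict]
    source_table: str
    upstream: tuple[str, ...]  # lineage of the source table
    refreshed_at: datetime
    is_fresh: bool             # True if within the caller's freshness budget


# Hypothetical in-memory stand-ins for the catalog, lineage store, and policy engine.
TABLES = {"sales.orders": [{"order_id": 7, "amount": 120.0}]}
LINEAGE = {"sales.orders": ("raw.orders", "raw.payments")}
REFRESHED_AT = {"sales.orders": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)}
GRANTS = {"sales.orders": {"finance-agent"}}


def retrieve(table: str, principal: str, now: datetime, max_age: timedelta) -> GovernedAnswer:
    # Enforce policy before any data is read.
    if principal not in GRANTS.get(table, set()):
        raise AccessDenied(f"{principal} may not read {table}")
    refreshed = REFRESHED_AT[table]
    return GovernedAnswer(
        rows=list(TABLES[table]),
        source_table=table,
        upstream=LINEAGE.get(table, ()),
        refreshed_at=refreshed,
        is_fresh=(now - refreshed) <= max_age,
    )


answer = retrieve(
    "sales.orders",
    "finance-agent",
    now=datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc),
    max_age=timedelta(hours=2),
)
print(answer.is_fresh, answer.upstream)
```

Returning provenance in the same object as the rows means an agent cannot get the data without also getting the signals it needs to decide whether to act on it.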

Governance and audit requirements

If an agent can answer a question, it can also leak a secret. That is why context graphs have to include policy and audit as first-class concerns.

Practical requirements:

every retrieved fact carries source and lineage metadata
permissions are enforced consistently across tools and engines
audit logs are queryable and retained long enough to be useful
human review is possible for high-risk actions
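One way these requirements might fit together is sketched below: an action is refused if it carries no source metadata, every request is written to an audit log that can be filtered, and high-risk actions are parked for human review instead of executing. The action names, the in-memory log, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical set of actions that require a human in the loop.
HIGH_RISK_ACTIONS = {"issue_refund", "delete_account"}


@dataclass
class AuditRecord:
    at: str             # ISO 8601 timestamp (UTC)
    agent: str
    action: str
    sources: list[str]  # datasets the agent relied on
    decision: str       # "executed" or "pending_review"


AUDIT_LOG: list[AuditRecord] = []  # stand-in for a durable, queryable store


def request_action(agent: str, action: str, sources: list[str]) -> str:
    """Record the request and decide whether it may run without review."""
    if not sources:
        raise ValueError("refusing to act on facts without source metadata")
    decision = "pending_review" if action in HIGH_RISK_ACTIONS else "executed"
    AUDIT_LOG.append(
        AuditRecord(datetime.now(timezone.utc).isoformat(), agent, action, list(sources), decision)
    )
    return decision


def audit_query(**filters: str) -> list[AuditRecord]:
    """Return audit records whose fields equal every given filter value."""
    return [r for r in AUDIT_LOG if all(getattr(r, k) == v for k, v in filters.items())]


request_action("support-agent", "send_status_email", ["sales.orders"])
request_action("support-agent", "issue_refund", ["sales.orders", "raw.payments"])
pending = audit_query(decision="pending_review")
print([r.action for r in pending])  # prints ['issue_refund']
```

In practice the log would need a retention policy and the review queue would need an owner; the sketch only shows where those hooks belong in the request path.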

OpenLineage is one concrete open standard for capturing lineage metadata from job executions across systems. See the OpenLineage documentation for details.
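For a sense of what such a lineage signal looks like, the sketch below builds a simplified run event as a plain dictionary, using only the standard library. The top-level field names follow the OpenLineage run event as I understand it; the job, dataset, and producer values are invented, and the schemaURL version shown is only an example, so check the OpenLineage specification for the version you target before relying on any of it.

```python
import json
import uuid
from datetime import datetime, timezone


def run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a simplified OpenLineage-style run event (illustrative, not validated)."""
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        # Hypothetical producer URI identifying the emitting pipeline.
        "producer": "https://example.com/hypothetical-pipeline",
        # Example version only; use the URL matching your spec version.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/$defs/RunEvent",
    }


event = run_event(
    "COMPLETE",
    job_namespace="warehouse",
    job_name="daily_orders_rollup",
    inputs=[("warehouse", "raw.orders")],
    outputs=[("warehouse", "sales.orders")],
)
print(json.dumps(event, indent=2))
```

Because the event is emitted at execution time, the link from raw.orders to sales.orders is captured as the job runs instead of being reconstructed after an incident.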

How to start without boiling the ocean

Start with the narrowest graph that still changes your automation posture:

pick one critical domain (customers, products, claims, orders, or accounts)
standardize identifiers and mappings, and document them as a contract
attach provenance (freshness and lineage) to key datasets
enforce policy where data is retrieved and where actions are taken

When agents can ask questions safely in one domain, expansion becomes a product decision, not a science project.
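The identifier step is the one most easily left informal, so here is a small sketch of treating mappings as a contract: each system-local id resolves to exactly one canonical id, and a conflicting registration is rejected. The class, the system names, and the ids are hypothetical.

```python
class MappingConflict(Exception):
    """Raised when a local id is re-registered under a different canonical id."""


class IdentifierContract:
    """Hypothetical identifier contract for a single domain."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self._to_canonical: dict[tuple[str, str], str] = {}

    def register(self, system: str, local_id: str, canonical_id: str) -> None:
        key = (system, local_id)
        existing = self._to_canonical.get(key)
        if existing is not None and existing != canonical_id:
            raise MappingConflict(
                f"{system}:{local_id} already maps to {existing}, not {canonical_id}"
            )
        self._to_canonical[key] = canonical_id

    def resolve(self, system: str, local_id: str):
        """Return the canonical id, or None if the local id is unknown."""
        return self._to_canonical.get((system, local_id))

    def aliases(self, canonical_id: str) -> list[tuple[str, str]]:
        """List every (system, local_id) pair that maps to this canonical id."""
        return sorted(k for k, v in self._to_canonical.items() if v == canonical_id)


contract = IdentifierContract("customers")
contract.register("crm", "ACC-001", "customer:42")
contract.register("billing", "9f3a", "customer:42")

print(contract.resolve("billing", "9f3a"))  # prints customer:42
print(contract.aliases("customer:42"))
```

Rejecting conflicts at registration time keeps the mapping deterministic, which matters more for agents than for analysts: an agent will not notice that two systems disagree about who a customer is.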

Sources to start with

Start with ODI's governance and AI safety posture, then ground lineage and policy in real standards.

