Open Data Infrastructure
From Semantic Layer to Context Graph
Semantic layers made BI usable. Context graphs make agents safe. The difference is not vocabulary. It is governance, lineage, and machine-actionable meaning.
Agents do not fail because they cannot query data. They fail because they cannot tell which data is allowed, current, trustworthy, and relevant.
Why semantic layers are not enough for agents
Semantic layers were built to help humans. They define metrics, dimensions, and business-friendly names so analysts can answer questions without memorizing physical schemas. That was the right abstraction for BI.
Agents change the problem. Agents need machine-actionable context: ownership, permissions, lineage, freshness, and the relationships between entities across domains. A metric definition alone does not tell an agent whether it is safe to act on the result.
Core idea: a semantic layer makes data understandable. A context graph makes data governable for machines.
What semantic layers solved
Semantic layers typically focus on:
- metric definitions and aggregation rules
- business-friendly naming and documentation
- join paths and dimensional modeling conventions
- consistency across dashboards and reports
That is valuable. It reduces dashboard chaos and makes analytics repeatable. It also tends to assume a bounded audience and a bounded set of use cases.
What a context graph adds
A context graph extends the semantic idea in three directions:
- relationships across domains: entities, identifiers, and mappings that let a system navigate the business graph
- governance and policy: machine-readable rules about who can see what and under which conditions
- provenance: lineage, freshness, and quality signals attached to the answers an agent produces
Think of a context graph as "semantic layer plus the things that make it safe to automate." Without governance and provenance, automation becomes a liability.
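As a rough illustration of "semantic layer plus the things that make it safe to automate," the sketch below models one graph node that carries relationships, a policy, and provenance together. Every name here (Provenance, Node, safe_to_use, the role strings) is hypothetical and chosen for the example; a real context graph would live in a catalog or graph store, not in in-memory dataclasses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Provenance:
    source: str             # system or table the value came from
    upstream: list[str]     # lineage: datasets this one was derived from
    refreshed_at: datetime  # time of the last successful refresh
    max_age: timedelta      # freshness budget agreed for this dataset

    def is_fresh(self, now: datetime) -> bool:
        return now - self.refreshed_at <= self.max_age


@dataclass
class Node:
    node_id: str
    kind: str                 # e.g. "metric", "entity", "dataset"
    owner: str
    allowed_roles: set[str]   # governance: who may read this node
    provenance: Provenance
    # relationships across domains: relation name -> target node ids
    edges: dict[str, list[str]] = field(default_factory=dict)


def safe_to_use(node: Node, role: str, now: datetime) -> bool:
    """Usable only if policy allows the role AND the data is fresh."""
    return role in node.allowed_roles and node.provenance.is_fresh(now)


# Hypothetical example node: a revenue metric owned by finance.
revenue = Node(
    node_id="metric:net_revenue",
    kind="metric",
    owner="finance-data",
    allowed_roles={"finance_agent"},
    provenance=Provenance(
        source="warehouse.finance.net_revenue",
        upstream=["raw.invoices", "raw.refunds"],
        refreshed_at=datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc),
        max_age=timedelta(hours=24),
    ),
    edges={"measured_for": ["entity:customer"]},
)
```

The metric definition alone corresponds to node_id and kind. The remaining fields are what lets an agent decide whether acting on the result is safe.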
A practical architecture view
A practical ODI-aligned context graph architecture usually includes:
- open storage and tables: the durable data contract (table formats and catalogs)
- metadata and lineage: operational events captured from execution, not reconstructed after incidents
- entity graph: a graph view that maps identifiers and relationships across systems
- policy engine: enforcement in the data path and the agent tool layer
- retrieval layer: a governed interface that returns data plus the provenance that makes it trustworthy
This fits naturally with the ODI idea that governance should be infrastructure behavior. For background, start with "ODI for AI Agents" and "ODI Foundation for AI."
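One way to picture the policy engine and retrieval layer working together is a single function that checks policy first and then returns rows with their provenance attached. This is a minimal sketch under stated assumptions: CATALOG and POLICY are hypothetical in-memory stand-ins for a real catalog and policy engine, and the dataset and role names are invented for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GovernedResult:
    rows: list[dict[str, Any]]
    source: str          # where the rows came from
    lineage: list[str]   # upstream datasets
    refreshed_at: str    # ISO 8601 timestamp of the last refresh


class PolicyDenied(Exception):
    """Raised when the caller's role may not read the dataset."""


# Hypothetical stand-ins for the catalog and the policy engine.
CATALOG = {
    "orders_daily": {
        "source": "warehouse.sales.orders_daily",
        "lineage": ["raw.orders", "raw.customers"],
        "refreshed_at": "2024-01-02T06:00:00+00:00",
        "rows": [{"order_date": "2024-01-01", "orders": 128}],
    }
}
POLICY = {"orders_daily": {"sales_agent", "finance_agent"}}


def retrieve(dataset: str, role: str) -> GovernedResult:
    # Deny by default: unknown datasets and unlisted roles are both refused.
    if role not in POLICY.get(dataset, set()):
        raise PolicyDenied(f"{role} may not read {dataset}")
    entry = CATALOG[dataset]
    return GovernedResult(
        rows=entry["rows"],
        source=entry["source"],
        lineage=entry["lineage"],
        refreshed_at=entry["refreshed_at"],
    )
```

Because the agent tool layer only ever sees GovernedResult, it cannot receive data without the provenance that makes it trustworthy.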
Governance and audit requirements
If an agent can answer a question, it can also leak a secret. That is why context graphs have to include policy and audit as first-class concerns.
Practical requirements:
- every retrieved fact carries source and lineage metadata
- permissions are enforced consistently across tools and engines
- audit logs are queryable and retained long enough to be useful
- human review is possible for high-risk actions
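The audit and human-review requirements above can be sketched as one gate that every agent action passes through. The action names, the HIGH_RISK_ACTIONS set, and the record fields are assumptions made for illustration; a production system would write to durable, queryable storage instead of a Python list.

```python
import json
from datetime import datetime, timezone

# Hypothetical set of actions that must wait for a human.
HIGH_RISK_ACTIONS = {"issue_refund", "delete_record"}


def record_and_decide(action: str, actor: str, sources: list[str],
                      audit_log: list[str]) -> str:
    """Append an audit record, then return the routing decision."""
    if action in HIGH_RISK_ACTIONS:
        decision = "needs_human_review"
    else:
        decision = "auto_approved"
    record = {
        "at": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "sources": sources,   # the datasets the agent relied on
        "decision": decision,
    }
    # One JSON document per entry keeps the log easy to query later.
    audit_log.append(json.dumps(record, sort_keys=True))
    return decision
```

Logging happens before the decision is returned, so even refused or deferred actions leave a trace.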
OpenLineage is one concrete standard for capturing execution lineage signals across systems. See the OpenLineage documentation for details.
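To make "execution lineage signals" concrete, the sketch below builds a plain dictionary shaped like an OpenLineage run event, without using any client library. The field names reflect the core run-event fields as understood here (eventType, eventTime, run, job, inputs, outputs, producer, schemaURL); treat that shape as an assumption and confirm it against the current spec. The function name build_run_event is hypothetical, and producer and schema_url are passed in by the caller and should be copied from the spec version you target.

```python
import uuid
from datetime import datetime, timezone


def build_run_event(job_namespace: str, job_name: str,
                    inputs: list[tuple[str, str]],
                    outputs: list[tuple[str, str]],
                    producer: str, schema_url: str,
                    event_type: str = "COMPLETE") -> dict:
    """Return a dict shaped like an OpenLineage run event (assumed shape)."""

    def dataset(ref: tuple[str, str]) -> dict:
        namespace, name = ref
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ref) for ref in inputs],
        "outputs": [dataset(ref) for ref in outputs],
        "producer": producer,      # URI identifying the emitting code
        "schemaURL": schema_url,   # URI of the spec version in use
    }
```

Emitting an event like this at execution time is what lets lineage be captured from the run itself instead of reconstructed after an incident.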
How to start without boiling the ocean
Start with the narrowest graph that still changes your automation posture:
- pick one critical domain (customers, products, claims, orders, or accounts)
- standardize identifiers and mappings, and document them as a contract
- attach provenance (freshness and lineage) to key datasets
- enforce policy where data is retrieved and where actions are taken
When agents can ask questions safely in one domain, expansion becomes a product decision, not a science project.
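For the "standardize identifiers and mappings, and document them as a contract" step, a minimal sketch might look like the following. IdMapping, IdentifierContract, and the system names are hypothetical; the point is only that each (system, local id) pair resolves to exactly one canonical id, and that conflicting mappings fail loudly when the contract is loaded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdMapping:
    system: str        # e.g. "crm", "billing"
    local_id: str      # identifier as that system knows it
    canonical_id: str  # identifier agreed on in the contract


class IdentifierContract:
    def __init__(self, domain: str, mappings: list[IdMapping]) -> None:
        self.domain = domain
        self._index: dict[tuple[str, str], str] = {}
        for m in mappings:
            key = (m.system, m.local_id)
            existing = self._index.get(key)
            # A local id must never point at two canonical ids.
            if existing is not None and existing != m.canonical_id:
                raise ValueError(
                    f"{key} maps to both {existing} and {m.canonical_id}"
                )
            self._index[key] = m.canonical_id

    def resolve(self, system: str, local_id: str) -> Optional[str]:
        """Return the canonical id, or None if the pair is unmapped."""
        return self._index.get((system, local_id))
```

Starting with one domain keeps this table small enough to review by hand before agents depend on it.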
Sources to start with
Start with ODI's governance and AI safety posture, then ground lineage and policy in real standards.