Open Data Infrastructure
What Is Agentic Data?
Agentic data is data designed for systems that act, not just systems that report. It emphasizes permissions, provenance, and action traces as part of the data contract.
Dashboards tolerate ambiguity. Agents do not. If the system is going to act, the data contract has to get sharper.
A useful definition
Agentic data is data packaged with the metadata and controls required for an automated system to use it safely. It is not a new storage format. It is a discipline: you treat permissions, provenance, and action traces as first-class parts of the dataset.
Core idea: agentic data is not only "data for AI." It is data that stays governable when the consumer is a machine that can take actions.
Why agentic systems change the data contract
Traditional analytics assumes a human in the loop. If a metric looks wrong, a human questions it. If access looks suspicious, a human reports it. If the query is ambiguous, a human clarifies intent.
Agents collapse that buffer. An agent can query, decide, and act in one loop. That makes three things non-negotiable:
- permissions: the agent must only see what it is allowed to see
- provenance: the agent must be able to retrieve source, freshness, and lineage signals with the data
- auditability: humans must be able to reconstruct why the agent got a piece of data and what it did with it
This is why Open Data Infrastructure (ODI) treats governance as infrastructure behavior, not as a process document.
Properties of agentic data
Agentic data usually has these properties:
- machine-actionable metadata: owners, domains, and meanings that a system can use programmatically
- policy-carrying access: authorization that is evaluated in the data path and the tool layer
- retrieval with provenance: results include lineage, freshness, and quality signals
- action traces: every agent action is logged with inputs, retrieved context, and outputs
- contract-driven datasets: schemas and identifiers treated as stable interfaces, not suggestions
If you have a semantic layer today, this is the next step. See "From Semantic Layer to Context Graph."
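To make "retrieval with provenance" concrete, here is a minimal Python sketch of a result envelope that carries owner, lineage, freshness, and quality signals alongside the rows. Every name in it (Provenance, RetrievalResult, retrieve_orders, the dataset identifiers) is a hypothetical chosen for illustration, and the values are made-up samples, not an ODI API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    dataset_id: str              # stable identifier, treated as an interface
    owner: str                   # accountable team (hypothetical value below)
    lineage: tuple[str, ...]     # upstream datasets, nearest first
    refreshed_at: datetime       # freshness signal
    quality_checks_passed: bool  # quality signal


@dataclass(frozen=True)
class RetrievalResult:
    rows: tuple[dict, ...]
    provenance: Provenance


def retrieve_orders() -> RetrievalResult:
    """Stand-in for a governed retrieval call; returns hard-coded sample data."""
    return RetrievalResult(
        rows=({"order_id": 1, "status": "shipped"},),
        provenance=Provenance(
            dataset_id="sales.orders",
            owner="sales-data-team",
            lineage=("raw.order_events",),
            refreshed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
            quality_checks_passed=True,
        ),
    )


def describe(result: RetrievalResult) -> str:
    """Summarize where the rows came from so an agent or auditor can cite it."""
    p = result.provenance
    return (
        f"{len(result.rows)} row(s) from {p.dataset_id}, "
        f"owner {p.owner}, refreshed {p.refreshed_at.isoformat()}"
    )
```

The design point is that the rows never travel without their provenance: an agent that receives a RetrievalResult can check ownership and freshness programmatically before it acts.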
Where it fits in the ODI stack
Agentic data sits at the intersection of:
- open tables and catalogs: durable storage and interoperability contracts
- governance: consistent enforcement and audit
- context layer: retrieval that is governed and explainable
If your agent stack can bypass the data platform controls, you do not have agentic data. You have a security incident waiting for a prompt.
For the ODI view, start with "ODI for AI Agents" and the "ODI Glossary."
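The bypass risk described above can be illustrated with a toy store whose only read path goes through the policy check. This is a sketch under stated assumptions: Policy, GovernedStore, AccessDenied, and the dataset names are all hypothetical, and a real platform would enforce this in the query engine or catalog rather than in application code.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised when a principal has no grant for a dataset."""


@dataclass
class Policy:
    # Maps a principal (agent or user id) to the datasets it may read.
    grants: dict[str, frozenset[str]]

    def allows(self, principal: str, dataset: str) -> bool:
        return dataset in self.grants.get(principal, frozenset())


class GovernedStore:
    """Toy data platform: every read is authorized in the data path."""

    def __init__(self, tables: dict[str, list[dict]], policy: Policy) -> None:
        self._tables = tables
        self._policy = policy

    def read(self, principal: str, dataset: str) -> list[dict]:
        # The check runs here, not in the calling app or UI, so any tool
        # that connects to the store is subject to the same policy.
        if not self._policy.allows(principal, dataset):
            raise AccessDenied(f"{principal} may not read {dataset}")
        # Return copies so callers cannot mutate the stored rows.
        return [dict(row) for row in self._tables[dataset]]
```

Because the check lives inside read, a new agent tool gets no extra access by connecting directly: it hits the same policy as every other consumer.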
Common anti-patterns
- RAG (retrieval-augmented generation) without governance: retrieval that returns text and facts without policy or provenance
- context without freshness: cached answers that nobody can date or validate
- permissions checked only in the app: access control that disappears when a tool connects directly to data
- no action traces: automation without reconstructable audit logs
Those anti-patterns are why AI makes ODI more important, not less.
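The "context without freshness" anti-pattern has a simple guard: refuse cached context that cannot be dated or that is older than an agreed limit. The sketch below assumes a hypothetical is_usable helper and a caller-supplied max_age; the actual threshold is a policy decision, not something this example prescribes.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_usable(
    refreshed_at: Optional[datetime],
    now: datetime,
    max_age: timedelta,
) -> bool:
    """Return True only if cached context is dated and recent enough.

    Both datetimes should be timezone-aware (or both naive); Python raises
    TypeError when an aware and a naive datetime are subtracted.
    """
    if refreshed_at is None:
        # Undatable context is rejected rather than assumed fresh.
        return False
    age = now - refreshed_at
    # A negative age means the timestamp is in the future, which is also
    # treated as invalid.
    return timedelta(0) <= age <= max_age
```

An agent would call this before using a cached answer and fall back to a fresh, governed retrieval when it returns False.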
A checklist for teams building agents
- Define the domains and datasets agents are allowed to access.
- Enforce access controls in the data path, not only in the UI.
- Attach provenance (freshness, lineage, owner) to retrieved context.
- Log agent actions with inputs, retrieved context, and outputs.
- Build human review and rollback for high-risk actions.
Agents become safe when audit becomes normal.
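The logging and human-review items in the checklist can be sketched together. In this minimal Python example, every action produces a trace with inputs, retrieved context, and output, and actions on a high-risk list are held until a human approves them. ActionTrace, run_action, and the HIGH_RISK_ACTIONS names are assumptions made for illustration. Rollback is not covered here.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable

# Hypothetical list; which actions count as high-risk is a team decision.
HIGH_RISK_ACTIONS = {"issue_refund", "delete_records"}


@dataclass
class ActionTrace:
    agent_id: str
    action: str
    inputs: dict
    retrieved_context: list
    output: Any = None
    status: str = "pending"
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def run_action(
    agent_id: str,
    action: str,
    inputs: dict,
    retrieved_context: list,
    execute: Callable[[dict], Any],
    log: list,
    approved_by_human: bool = False,
) -> ActionTrace:
    """Run or hold an action, and always append a trace to the log."""
    trace = ActionTrace(agent_id, action, inputs, retrieved_context)
    if action in HIGH_RISK_ACTIONS and not approved_by_human:
        # High-risk actions wait for review; nothing is executed.
        trace.status = "held_for_review"
    else:
        trace.output = execute(inputs)
        trace.status = "executed"
    # One JSON line per action; default=str covers non-JSON-native values.
    log.append(json.dumps(asdict(trace), default=str))
    return trace
```

The trace is written whether the action runs or is held, so a reviewer can reconstruct both what the agent did and what it attempted.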
Sources to start with
Start with governance and provenance standards, then map them into your architecture.