Open Data Infrastructure
Data Modeling for Event-Sourced Agent Workflows
How events, commands, state transitions, idempotency, ownership, and audit trails shape recoverable agent workflow models.
Agent workflows fail in boring ways: duplicate actions, lost state, unclear ownership, and no record of why the next step happened.
Agents need recoverable state
Event-sourced design records changes as events rather than only storing the latest state. CloudEvents provides a common format for describing event data, and OpenTelemetry traces connect the operations that make up a request path. Both patterns suit agent workflows, because an agent workflow is not a single function call. It is a sequence of observations, tool calls, decisions, and state changes.
Data modeling has to capture more than entities. It needs commands, events, state transitions, idempotency keys, owners, permissions, retries, and correction paths.
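As a minimal sketch of what "more than entities" can look like, the record below carries ownership, an idempotency key, a retry counter, and a correction pointer next to the payload. Every field name here is an assumption made for illustration. The `id`, `source`, `type`, and `time` fields are loosely modeled on CloudEvents context attributes, but this is not a conforming CloudEvents implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional
import uuid


@dataclass(frozen=True)
class WorkflowEvent:
    """One immutable fact about a workflow (field names are illustrative)."""

    workflow_id: str
    type: str                       # e.g. "tool_called" or "policy_denied"
    owner: str                      # who approves or repairs this step
    idempotency_key: str            # identical across retries of one command
    data: Dict[str, Any] = field(default_factory=dict)
    # Context attributes loosely modeled on CloudEvents (id, source, time).
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source: str = "/agents/example"  # hypothetical source URI
    time: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    attempt: int = 1                # retry counter, starting at 1
    corrects: Optional[str] = None  # id of the event this one corrects, if any


# Hypothetical workflow, owner, and tool names.
event = WorkflowEvent(
    workflow_id="wf-001",
    type="tool_called",
    owner="payments-team",
    idempotency_key="wf-001:step-3",
    data={"tool": "lookup_invoice"},
)
```

Freezing the dataclass is a deliberate choice in this sketch: an event that cannot be reassigned after creation has to be corrected by appending a new event rather than by editing history.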
Model the workflow as evidence
A recoverable agent workflow needs a clear event model: requested, planned, tool_called, data_accessed, policy_denied, human_review_requested, action_committed, corrected, and closed. The exact names will vary. The discipline should not.
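One way to hold that discipline is to write the allowed transitions down as data and check event histories against them. The sketch below uses the event names listed above, but the transition table itself is an assumption: a real workflow would define its own rules about which event may follow which.

```python
from typing import Dict, List, Optional, Set

# Steps an agent may take while it is still working (assumed grouping).
WORK_STEPS: Set[str] = {
    "tool_called",
    "data_accessed",
    "policy_denied",
    "human_review_requested",
}

# Hypothetical transition table: event type -> event types allowed next.
ALLOWED_NEXT: Dict[str, Set[str]] = {
    "requested": {"planned", "policy_denied"},
    "planned": set(WORK_STEPS),
    "tool_called": WORK_STEPS | {"action_committed"},
    "data_accessed": WORK_STEPS | {"action_committed"},
    "policy_denied": {"human_review_requested", "closed"},
    "human_review_requested": {"action_committed", "closed"},
    "action_committed": {"corrected", "closed"},
    "corrected": {"corrected", "closed"},
    "closed": set(),
}


def first_invalid_transition(types: List[str]) -> Optional[int]:
    """Return the index of the first event that breaks the table, else None."""
    if not types:
        return None
    if types[0] != "requested":
        return 0
    for i in range(1, len(types)):
        if types[i] not in ALLOWED_NEXT.get(types[i - 1], set()):
            return i
    return None


ok_history = ["requested", "planned", "tool_called", "action_committed", "closed"]
bad_history = ["requested", "action_committed"]  # skips planning entirely
```

Under this table, human review is an explicit state the workflow passes through, not a side conversation, and a commit that skips planning is detectable from the history alone.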
Idempotency matters because agents retry. Ownership matters because someone has to approve or repair the workflow. Audit trails matter because agent actions become part of business history.
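The retry problem can be illustrated with a small in-memory sketch: a command carries an idempotency key, and a second attempt with the same key returns the recorded event instead of running the side effect again. The class and the refund example are hypothetical, and a real system would typically enforce key uniqueness in durable storage so that the check and the write are atomic.

```python
from typing import Any, Callable, Dict, List


class IdempotentExecutor:
    """Runs each idempotency key at most once and records the outcome."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Dict[str, Any]] = {}
        self.log: List[Dict[str, Any]] = []

    def execute(
        self, idempotency_key: str, owner: str, action: Callable[[], Any]
    ) -> Dict[str, Any]:
        previous = self._by_key.get(idempotency_key)
        if previous is not None:
            # A retry: return the recorded event, do not repeat the action.
            return previous
        event = {
            "type": "action_committed",
            "idempotency_key": idempotency_key,
            "owner": owner,
            "result": action(),
        }
        self.log.append(event)
        self._by_key[idempotency_key] = event
        return event


calls: List[str] = []


def send_refund() -> Dict[str, str]:
    calls.append("refund")  # stands in for a real side effect
    return {"status": "sent"}


executor = IdempotentExecutor()
first = executor.execute("wf-001:refund", "payments-team", send_refund)
retry = executor.execute("wf-001:refund", "payments-team", send_refund)
```

After both calls the side effect has run once, the log holds one `action_committed` event, and the retry received the same event as the first attempt.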
Core idea: Agent workflow data models should treat every action as an event with ownership, policy, and replay context.
The ODI workflow pattern
Open Data Infrastructure gives these events something to point at. A data access event can reference a data product. A policy denial can reference the catalog rule. A correction event can reference the lineage path and evaluation result.
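A rough sketch of "something to point at": each event carries a small set of references, so the log can be queried by data product, catalog rule, or lineage path. The reference kinds and every identifier below are invented for illustration and do not come from any particular catalog or lineage tool.

```python
from typing import Any, Dict, List

# Hypothetical events; all identifiers are made up for this example.
log: List[Dict[str, Any]] = [
    {
        "type": "data_accessed",
        "refs": {"data_product": "orders.daily_summary"},
    },
    {
        "type": "policy_denied",
        "refs": {"catalog_rule": "pii-export-restricted"},
    },
    {
        "type": "corrected",
        "refs": {
            "data_product": "orders.daily_summary",
            "lineage_path": "orders.raw/orders.daily_summary",
            "evaluation_result": "eval-0042",
        },
    },
]


def events_referencing(
    events: List[Dict[str, Any]], kind: str, identifier: str
) -> List[Dict[str, Any]]:
    """Return events whose reference of the given kind equals identifier."""
    return [e for e in events if e.get("refs", {}).get(kind) == identifier]


touching_orders = events_referencing(log, "data_product", "orders.daily_summary")
```

With references stored on the event, a question such as "which workflow steps touched this data product" becomes a filter over the log rather than a reconstruction exercise.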
For adjacent context, read data modeling for multi-agent workflows, agent write paths and human review, and agentic replay logs.
What breaks first
- The workflow stores final state but not the decisions that produced it.
- Retries create duplicate actions because idempotency is not modeled.
- Human review is a comment thread instead of a state transition.
- Corrections update the final answer without preserving the original event path.
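The last failure in that list has a simple alternative, sketched below under assumed field names: a correction is appended as its own event that points at the event it corrects, and the current answer is derived by replaying the log. The original event stays in place.

```python
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]


def append_correction(
    log: List[Event], corrects_id: str, new_answer: str, owner: str
) -> List[Event]:
    """Return a new log with a 'corrected' event appended; nothing is edited."""
    if not any(e["id"] == corrects_id for e in log):
        raise ValueError(f"unknown event id: {corrects_id}")
    correction = {
        "id": f"evt-{len(log) + 1}",  # simplistic id scheme for the sketch
        "type": "corrected",
        "corrects": corrects_id,
        "owner": owner,
        "data": {"answer": new_answer},
    }
    return log + [correction]


def current_answer(log: List[Event]) -> Optional[str]:
    """Replay the log; the latest commit or correction wins."""
    answer = None
    for e in log:
        if e["type"] in ("action_committed", "corrected"):
            answer = e["data"]["answer"]
    return answer


# Hypothetical history with one committed answer.
original_log: List[Event] = [
    {"id": "evt-1", "type": "requested", "owner": "agent", "data": {}},
    {
        "id": "evt-2",
        "type": "action_committed",
        "owner": "agent",
        "data": {"answer": "refund 40.00"},
    },
]

corrected_log = append_correction(
    original_log, "evt-2", "refund 45.00", owner="payments-team"
)
```

Replaying `corrected_log` yields the corrected answer, while `evt-2` is still present with its original payload and the correction names who made it.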
Questions to ask
Ask which events define the workflow, which fields make events replayable, and which identifiers connect events to data products and policies. Ask how retries, denials, and human approvals appear in the model.
If the workflow matters, the event history is the product.
Sources to start with
These primary sources anchor the technical claims in this guide.
- CloudEvents specification overview
- OpenTelemetry traces documentation
- OpenLineage object model documentation
- W3C PROV overview
Agent state without events is memory without accountability.