Context Graphs for Data Access Decisions

Access control answers "can this identity do this thing?" A context graph helps answer the question people actually care about: should this action happen in this situation?

The practical problem

Modern data access decisions involve more than a user, a role, and a table. They involve purpose, domain ownership, data sensitivity, lineage, freshness, geography, consent, downstream action, and sometimes an AI agent acting on behalf of someone else.

A context graph connects those facts so access decisions can draw on context recorded in shared infrastructure instead of scattered tribal knowledge. It does not replace policy engines or catalogs. It gives them better facts to work with.

Core idea: context graphs make access decisions explainable by connecting identity, purpose, policy, lineage, and data meaning.

What belongs in the graph

Start with identities and delegation. A human analyst, scheduled job, data product API, and agent should not all collapse into one generic service account. The graph needs to preserve who requested the work and on whose behalf.
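One way to keep delegation visible is to model each principal with an explicit link to whoever it acts for. The sketch below is illustrative only: the Principal class, its field names, and the example identities are assumptions, not part of any particular product or standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Principal:
    """Hypothetical identity node; 'kind' keeps humans, jobs, APIs, and agents distinct."""
    name: str
    kind: str  # e.g. "human", "job", "api", "agent"
    on_behalf_of: Optional["Principal"] = None


def delegation_chain(principal: Optional[Principal]) -> List[Principal]:
    """Walk from the acting principal back to the original requester."""
    chain = []
    while principal is not None:
        chain.append(principal)
        principal = principal.on_behalf_of
    return chain


# Assumed example: an agent acting for a human analyst.
analyst = Principal("ana", "human")
agent = Principal("report-agent", "agent", on_behalf_of=analyst)

print(" -> ".join(f"{p.kind}:{p.name}" for p in delegation_chain(agent)))
```

Because the chain is stored rather than inferred, a policy check can ask both "what kind of actor is this?" and "who ultimately requested the work?" without relying on a shared service account name.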

Add data product meaning. The graph should connect tables, columns, metrics, entities, owners, quality signals, and allowed use cases. That is how a policy decision can consider meaning instead of only location.
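As a rough illustration, meaning can be stored as simple subject, relation, object edges and queried at decision time. The table names, relation names, and purposes below are hypothetical placeholders chosen for the example.

```python
# Hypothetical edges: (subject, relation, object).
EDGES = [
    ("sales.orders_v2", "part_of", "orders_product"),
    ("orders_product", "owned_by", "commerce-domain"),
    ("orders_product", "allows_purpose", "revenue_reporting"),
    ("sales.orders_v2.email", "represents", "customer_contact"),
]


def objects(subject, relation):
    """Return every object linked to 'subject' by 'relation'."""
    return [o for s, r, o in EDGES if s == subject and r == relation]


def purpose_allowed_for_table(table, purpose):
    """A table inherits allowed purposes from the data products it belongs to."""
    products = objects(table, "part_of")
    return any(purpose in objects(p, "allows_purpose") for p in products)


print(purpose_allowed_for_table("sales.orders_v2", "revenue_reporting"))
print(purpose_allowed_for_table("sales.orders_v2", "marketing_outreach"))
```

The point of the sketch is that the check never mentions a schema or path rule. It asks what the table is part of and what that product is for.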

Connect lineage and provenance. If a derived data product includes restricted source data, the access decision should know that. If an agent's answer is based on stale context, the system should know that too.
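A minimal sketch of the lineage check, assuming lineage is available as a mapping from each asset to its direct upstream sources. The asset names and the "restricted" label are made up for illustration.

```python
# Hypothetical lineage: asset -> direct upstream sources.
UPSTREAM = {
    "churn_dashboard": ["churn_features"],
    "churn_features": ["orders", "support_tickets"],
    "orders": [],
    "support_tickets": [],
}

# Hypothetical sensitivity labels; unlabeled assets are treated as unrestricted.
SENSITIVITY = {"support_tickets": "restricted"}


def restricted_sources(asset):
    """Return the asset itself and any transitive upstream source labeled restricted."""
    seen, stack, found = set(), [asset], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against diamonds or cycles in the lineage
        seen.add(node)
        if SENSITIVITY.get(node) == "restricted":
            found.append(node)
        stack.extend(UPSTREAM.get(node, []))
    return sorted(found)


print(restricted_sources("churn_dashboard"))
```

A check that only looked at the dashboard's own label would see nothing sensitive. Walking upstream is what surfaces the restricted source two hops away.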

What breaks first

Policy checks allow access to a derived asset without understanding restricted upstream lineage.
Agents use data for a purpose that the original consent or governance model did not allow.
Service accounts hide the difference between human, workflow, and agent activity.
Access denials are correct but impossible to explain to the person trying to do the work.
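
The last failure mode, unexplainable denials, can be reduced by having the decision return its reasons alongside the verdict. The following is a sketch under stated assumptions: the Decision class, the decide function, and its parameters are hypothetical and stand in for whatever a real policy engine would expose.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Decision:
    allowed: bool
    reasons: List[str] = field(default_factory=list)


def decide(purpose, allowed_purposes, restricted_upstream, cleared_for_restricted):
    """Collect every failed check so a denial can be explained, not just reported."""
    reasons = []
    if purpose not in allowed_purposes:
        reasons.append(f"purpose '{purpose}' is not an allowed use of this asset")
    if restricted_upstream and not cleared_for_restricted:
        reasons.append(
            "asset derives from restricted sources: " + ", ".join(restricted_upstream)
        )
    return Decision(allowed=not reasons, reasons=reasons)


# Assumed inputs; in practice these facts would come from the graph.
decision = decide(
    purpose="marketing_outreach",
    allowed_purposes={"revenue_reporting"},
    restricted_upstream=["support_tickets"],
    cleared_for_restricted=False,
)
for reason in decision.reasons:
    print("denied:", reason)
```

Evaluating all checks instead of stopping at the first failure is a deliberate choice here: the requester sees everything that would need to change in one response.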

Questions to ask

Use these questions when designing context-aware access.

Which identities, delegations, and purposes are represented?
Which policies depend on lineage, sensitivity, freshness, or domain meaning?
Can the system explain why access was allowed or denied?
Can an agent inspect the same context before acting?
Which facts must be current for the decision to remain valid?
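
The freshness question can be made concrete by attaching an observation time and a maximum age to each fact a decision relies on. This is a sketch only: the fact names and age limits are invented for the example, and the current time is passed in so the check is repeatable.

```python
from datetime import datetime, timedelta, timezone


def stale_facts(facts, now):
    """Return the names of facts older than their allowed age.

    'facts' maps a fact name to (observed_at, max_age).
    """
    return sorted(
        name
        for name, (observed_at, max_age) in facts.items()
        if now - observed_at > max_age
    )


# Assumed example values.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
facts = {
    "consent_status": (now - timedelta(hours=30), timedelta(hours=24)),
    "lineage_snapshot": (now - timedelta(hours=2), timedelta(hours=6)),
}

print(stale_facts(facts, now))
```

A decision that depends on any fact in the returned list could be re-evaluated or denied until that fact is refreshed.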

For the architecture map, read The AI Context Layer, Why Agents Need Governed Data Access, and Policy Enforcement in Open Data Systems.

Sources to start with

The useful sources cover provenance, lineage, policy, and AI risk guidance. The value comes from connecting them.

