Knowledge Graphs and Open Data Infrastructure
Knowledge graphs add relationships and meaning on top of governed data. They should extend Open Data Infrastructure (ODI), not replace the open table and catalog contracts underneath.
A knowledge graph is not a magic layer you pour over messy data until it becomes intelligent.
Graphs do not fix weak data contracts
Knowledge graphs are useful because relationships matter. Customers belong to accounts. Products move through supply chains. Policies apply to domains. Metrics depend on entities. AI systems need those relationships to reason about context.
But a graph over ungoverned, stale, poorly documented data is just a more connected mess. ODI still has to provide the durable table, catalog, lineage, and policy contracts underneath.
Core idea: a knowledge graph should connect governed data products, not become a new hiding place for data debt.
The graph adds relationships, not storage magic
Open tables are good at durable, analytical state. Catalogs are good at identity, discovery, and operations. Lineage systems are good at recording how data moved and changed. A knowledge graph adds a relationship layer over those assets.
That relationship layer can represent entities, business concepts, policy scopes, ownership, lineage, and semantic mappings. It does not need to store every fact as a graph triple to be valuable. It needs to make relationships machine-readable and governable.
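As a minimal sketch of what "machine-readable and governable" could look like, the Python below models each relationship as a small record that points at a governed table by identifier instead of copying its rows. The class name, field names, and every identifier string are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One edge in the relationship layer (hypothetical schema)."""
    subject: str        # entity or business concept the edge starts from
    predicate: str      # kind of relationship
    obj: str            # related entity, concept, or policy
    source_table: str   # catalog identifier of the governed table behind the edge
    owner: str          # team accountable for keeping the edge correct


def related_to(edges, subject, predicate=None):
    """Return the objects linked to `subject`, optionally filtered by predicate."""
    return [
        e.obj
        for e in edges
        if e.subject == subject and (predicate is None or e.predicate == predicate)
    ]


# Illustrative edges; every identifier below is made up.
EDGES = [
    Relationship("customer", "belongs_to", "account", "crm.accounts", "sales-data"),
    Relationship("customer", "governed_by", "pii_policy", "governance.policies", "privacy"),
    Relationship("net_revenue", "depends_on", "account", "finance.revenue", "finance-data"),
]
```

The edges carry only identifiers and ownership, so the facts themselves stay in the open tables and the graph remains a thin layer over them.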
AI needs graph context with provenance
Agents need more than retrieval. They need context that tells them which entities are related, which definitions apply, which sources are trusted, and which policies restrict use. A graph can help with that because it makes paths through context explicit.
Provenance is the difference between useful graph context and confident nonsense. If an agent uses a relationship, the platform should be able to explain where that relationship came from and when it was last validated.
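One way to picture this, under stated assumptions, is a provenance record attached to each relationship plus a freshness check before an agent is allowed to use it. The record shape, the function names, and the seven-day validation window are all assumptions made for this sketch, not a prescribed policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Provenance:
    """Where a relationship came from and when it was last checked (hypothetical)."""
    derived_from: str          # governed source the relationship was derived from
    last_validated: datetime   # timezone-aware time of the last successful check


def is_usable(prov, now, max_age=timedelta(days=7)):
    """Treat a relationship as usable only if it was validated within `max_age`."""
    return now - prov.last_validated <= max_age


def explain(name, prov):
    """Answer 'where did this relationship come from, and when was it checked?'."""
    return (
        f"{name}: derived from {prov.derived_from}, "
        f"last validated {prov.last_validated.date().isoformat()}"
    )


NOW = datetime(2024, 6, 10, tzinfo=timezone.utc)  # fixed clock for the example
fresh = Provenance("crm.accounts", datetime(2024, 6, 8, tzinfo=timezone.utc))
stale = Provenance("legacy.export", datetime(2024, 1, 2, tzinfo=timezone.utc))
```

With this shape, a retrieval layer could drop `stale` before it reaches an agent and attach the output of `explain` to any answer that relied on `fresh`.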
A practical ODI graph architecture
A grounded architecture usually has five layers:
- open tables for durable facts and history
- a catalog for table identity, ownership, and access
- lineage for source and transformation paths
- a graph layer for entities, relationships, definitions, and policies
- a governed retrieval layer for agents and applications
The graph should read from and point back to governed sources. If the graph becomes the only source of truth, it recreates the lock-in problem ODI was meant to avoid.
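The "point back to governed sources" rule can be sketched as a check that every graph edge resolves to a table registered in the catalog. The catalog contents, the edge tuple layout, and the function names below are hypothetical; a real platform would query its own catalog service instead of an in-memory dictionary.

```python
# Hypothetical catalog: table identifier -> owning team.
CATALOG = {
    "crm.accounts": "sales-data",
    "crm.users": "sales-data",
}

# Hypothetical graph edges: (subject, predicate, object, backing table identifier).
GRAPH_EDGES = [
    ("user", "belongs_to", "account", "crm.users"),
    ("account", "has_tier", "tier", "scratch.tier_lookup"),
]


def dangling_edges(edges, catalog):
    """Return edges whose backing table is not registered in the catalog."""
    return [edge for edge in edges if edge[3] not in catalog]


def resolve(edge, catalog):
    """Follow an edge back to its governed source and that source's owner."""
    table = edge[3]
    if table not in catalog:
        raise LookupError(f"{table} is not a catalogued table")
    return {"table": table, "owner": catalog[table]}
```

An edge that shows up in `dangling_edges` is a sign that the graph has started to hold knowledge the governed layers do not know about.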
Start with one business relationship that matters
Do not start with an enterprise ontology project. Start with one relationship that causes real failures: account to user, product to supplier, claim to policy, patient to encounter, or metric to source table.
Make that relationship governed, sourced, versioned, and available to one workflow. Then expand. Knowledge graphs become useful when they make real decisions safer, not when the diagram looks impressive.
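For a first relationship such as account to user, a starting point could be a small check that reports which records break it. The function name, the input shapes, and the sample data below are assumptions made for illustration only.

```python
def check_user_to_account(users, accounts):
    """Validate a hypothetical user-to-account relationship.

    `users` maps user id -> account id (or None); `accounts` is the set of
    known account ids. Returns the user ids that break the relationship.
    """
    missing = sorted(u for u, a in users.items() if a is None)
    unknown = sorted(
        u for u, a in users.items() if a is not None and a not in accounts
    )
    return {"missing_account": missing, "unknown_account": unknown}


# Made-up sample data for illustration.
ACCOUNTS = {"acct-1", "acct-2"}
USERS = {"u1": "acct-1", "u2": "acct-9", "u3": None, "u4": "acct-2"}

report = check_user_to_account(USERS, ACCOUNTS)
```

A report like this gives the one workflow that depends on the relationship a concrete signal for when it is safe to use.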
Sources to start with
Use W3C standards such as RDF and PROV, together with the documentation for your lineage tooling, to ground graph claims in provenance, identifiers, and machine-readable relationships.