Open Data Infrastructure
dbt Core Source Contracts for AI-Ready Lineage
How dbt source definitions, freshness, tests, exposures, and metadata can support AI-ready lineage without inventing magic.
Agents do not need more table names. They need to know which source tables are safe to trust.
AI-ready lineage starts upstream
Lineage that begins at the transformed model begins too late. By then, the system may know what produced a dashboard, but not whether the upstream source was fresh, tested, documented, or owned.
dbt Core can help because source definitions create a named place for upstream data in the DAG. Source freshness, tests, model contracts, and exposures add more context around how data is consumed and what shape downstream models promise to return.
dbt sources define source context
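dbt's documentation describes sources as a way to name and describe the data that extract-and-load tools write to the warehouse. Models select from them with the source() function, which places the upstream tables in the DAG. The documentation also covers source freshness: the dbt source freshness command checks whether a source's most recent load falls within the warn_after and error_after thresholds configured for it. Exposures describe downstream uses such as dashboards, applications, or data science pipelines.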
Those features do not automatically create "source contracts" as a formal dbt resource. dbt's contract configuration applies to models, not to sources. The safer ODI framing is that teams can assemble a source-side contract from source definitions, freshness rules, tests, ownership metadata, and downstream exposure context.
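One way to make that assembly concrete is to read the pieces out of the manifest.json artifact that dbt writes to the target directory. The sketch below is illustrative, not a dbt API: SourceContract and build_source_contracts are hypothetical names, the owner is assumed to live under a team-chosen meta.owner key, and the manifest field names (sources, nodes, exposures, depends_on.nodes, freshness) follow the artifact layout as commonly documented, which can differ between dbt versions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SourceContract:
    """Source-side context gathered for one dbt source (hypothetical type)."""
    unique_id: str
    owner: Optional[str] = None
    freshness_rule: Optional[dict] = None
    tests: list = field(default_factory=list)
    exposures: list = field(default_factory=list)


def _children(manifest: dict) -> dict:
    """Map each parent unique_id to the resources that depend on it."""
    children = {}
    for section in ("nodes", "exposures"):
        for uid, res in (manifest.get(section) or {}).items():
            parents = (res.get("depends_on") or {}).get("nodes") or []
            for parent in parents:
                children.setdefault(parent, []).append(uid)
    return children


def build_source_contracts(manifest: dict) -> dict:
    """Return {source unique_id: SourceContract} from a parsed manifest."""
    children = _children(manifest)
    nodes = manifest.get("nodes") or {}
    exposures = manifest.get("exposures") or {}
    contracts = {}
    for uid, src in (manifest.get("sources") or {}).items():
        # Assumption: ownership is stored under a `meta.owner` key.
        meta = src.get("meta") or (src.get("config") or {}).get("meta") or {}
        rule = src.get("freshness") or {}
        # A rule only counts if at least one threshold has a count set.
        has_rule = any(
            (rule.get(key) or {}).get("count") is not None
            for key in ("warn_after", "error_after")
        )
        contract = SourceContract(uid, meta.get("owner"), rule if has_rule else None)
        # Tests that depend directly on the source.
        contract.tests = sorted(
            child for child in children.get(uid, [])
            if nodes.get(child, {}).get("resource_type") == "test"
        )
        # Exposures reachable through any chain of downstream resources.
        seen, stack = set(), [uid]
        while stack:
            for child in children.get(stack.pop(), []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        contract.exposures = sorted(seen & set(exposures))
        contracts[uid] = contract
    return contracts
```

Calling build_source_contracts on the parsed manifest gives one record per source. A record with no owner, no freshness rule, or an empty tests list is a source that is named in the DAG but not yet covered by anything resembling a contract.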
Core idea: AI-ready lineage begins where data enters the modeled system, not where the agent first asks a question.
Call it a contract carefully
For related ODI patterns, read "dbt Core contracts and catalog drift," "dbt source freshness and data product SLAs," and "AI-ready data entitlement drift detection."
A useful source contract should tell the agent which upstream table is authoritative, how fresh it is, which tests passed, who owns it, and which downstream exposures depend on it. That context should travel into the metadata system so humans and agents inspect the same graph.
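The "how fresh" and "which tests passed" parts of that context are runtime facts, so they come from run artifacts rather than from the project definition. The sketch below assumes the sources.json file written by dbt source freshness and the run_results.json file written by a test run, each with a results list whose entries carry unique_id and status. The function names and the trust policy are illustrative assumptions, and artifact fields should be checked against the dbt version in use.

```python
import json
from pathlib import Path


def load_artifact(path):
    """Parse a dbt JSON artifact such as target/sources.json."""
    return json.loads(Path(path).read_text())


def source_evidence(source_id, test_ids, freshness_artifact, run_results_artifact):
    """Summarize the latest freshness and test outcomes for one source."""
    fresh = next(
        (r for r in freshness_artifact.get("results", [])
         if r.get("unique_id") == source_id),
        None,
    )
    status_by_id = {
        r.get("unique_id"): r.get("status")
        for r in run_results_artifact.get("results", [])
    }
    # Tests that were expected but never executed are reported explicitly.
    tests = {t: status_by_id.get(t, "not_run") for t in test_ids}
    freshness_status = fresh.get("status") if fresh else "not_checked"
    return {
        "source": source_id,
        "freshness_status": freshness_status,
        "max_loaded_at": fresh.get("max_loaded_at") if fresh else None,
        "tests": tests,
        # Illustrative policy: fresh, with at least one test, all passing.
        "trusted": freshness_status == "pass"
        and bool(tests)
        and all(status == "pass" for status in tests.values()),
    }
```

The returned dictionary is the kind of record that could be pushed into a metadata system next to the static definition, so that a person and an agent read the same freshness and test status for the same source identifier.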
What breaks first
- Sources are named, but freshness thresholds are not defined.
- Tests exist on models while source quality assumptions stay implicit.
- Exposures describe dashboards but not agent-facing applications.
- Catalog metadata drifts away from dbt project metadata.
Lineage questions
Ask whether a source table has an owner, freshness rule, tests, downstream exposure map, and catalog identity. If not, the agent sees a table. It does not see a trustworthy source.
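These questions can be turned into a small gate that runs before a table is offered to an agent. The sketch below is a minimal illustration under stated assumptions: the record is a plain dictionary with hypothetical keys (owner, freshness_rule, tests, exposures, catalog_id) that a team would populate from its own dbt and catalog metadata, and an empty or absent value counts as missing.

```python
# Hypothetical checklist keys, one per lineage question.
REQUIRED_CONTEXT = ("owner", "freshness_rule", "tests", "exposures", "catalog_id")


def missing_context(source: dict) -> list:
    """Return the checklist items that are absent or empty for a source."""
    return [key for key in REQUIRED_CONTEXT if not source.get(key)]


def describe_for_agent(name: str, source: dict) -> str:
    """Render a one-line summary an agent or reviewer can act on."""
    gaps = missing_context(source)
    if gaps:
        return f"{name}: table only, missing {', '.join(gaps)}"
    return f"{name}: source context complete"
```

For example, describe_for_agent("raw.orders", {"owner": "data-platform"}) reports the four items that are still missing, which makes the difference between a table and a trustworthy source explicit instead of implied.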
Sources to start with
These primary sources anchor the technical claims in this guide.
- dbt sources documentation
- dbt source freshness documentation
- dbt contract configuration documentation
- dbt exposures documentation
Lineage is AI-ready when the source context is as inspectable as the model output.