A query service can have perfect policy intent and still run the wrong work. The physical plan is where intent meets execution.

A governed query service needs proof

Apache DataFusion exposes query plans through EXPLAIN, and its documentation distinguishes logical plans from physical plans. The physical plan is the execution shape that results once optimization decisions, data organization (such as file layout), and hardware configuration (such as CPU count) are taken into account.

That makes physical plans useful evidence for governed query services. They can show whether filters push down, which files or partitions are scanned, where UDFs run, and which operators drive cost.
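As a rough illustration of reading that evidence, the Python sketch below pulls operators, scanned files, and pushed-down predicates out of a plan's text. The sample plan is hand-written to resemble DataFusion's indented EXPLAIN output, not captured from a real run. Operator names and detail fields vary by version (older releases print ParquetExec where newer ones print DataSourceExec), and the helper names parse_operators and scan_evidence are hypothetical.

```python
import re

# Hand-written sample that imitates the indented text form of a physical plan.
# Real output differs by DataFusion version and data source.
SAMPLE_PLAN = """\
ProjectionExec: expr=[region@0 as region, amount@1 as amount]
  CoalesceBatchesExec: target_batch_size=8192
    FilterExec: tenant_id@2 = 42
      DataSourceExec: file_groups={2 groups: [[data/p=1/a.parquet], [data/p=2/b.parquet]]}, projection=[region, amount, tenant_id], file_type=parquet, predicate=tenant_id@2 = 42
"""

# An operator line is optional indentation, a name ending in "Exec", then details.
OPERATOR_RE = re.compile(r"^(\s*)(\w+Exec)\b:?\s*(.*)$")


def parse_operators(plan_text):
    """Return one dict per operator line, top of the plan first."""
    operators = []
    for line in plan_text.splitlines():
        match = OPERATOR_RE.match(line)
        if match:
            indent, name, detail = match.groups()
            # Assumes two spaces of indentation per tree level.
            operators.append({"depth": len(indent) // 2, "name": name, "detail": detail})
    return operators


def scan_evidence(operators):
    """Summarize each file scan: which files it lists and whether a predicate reached it."""
    evidence = []
    for op in operators:
        if "file_groups=" not in op["detail"]:
            continue
        evidence.append({
            "operator": op["name"],
            "files": re.findall(r"[\w./=-]+\.parquet", op["detail"]),
            "has_pushed_predicate": "predicate=" in op["detail"],
        })
    return evidence


if __name__ == "__main__":
    ops = parse_operators(SAMPLE_PLAN)
    for op in ops:
        print("  " * op["depth"] + op["name"])
    print(scan_evidence(ops))
```

Text parsing like this is brittle by nature. It is best treated as a review aid, with the raw plan kept alongside any summary derived from it.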

Physical plans show real work

The logical plan says what the query means. The physical plan says how the system intends to run it. In a governed service, that difference matters. Policy can look correct at the SQL boundary while the execution path still scans too much data, bypasses an expected filter, or hides a custom function that needs review.

DataFusion also documents optimizer behavior and reports runtime metrics, for example through EXPLAIN ANALYZE. Those signals can turn a query service from a black box into something operators can inspect both before and after a production incident.
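One way to make runtime metrics inspectable is to parse the per-operator metrics that EXPLAIN ANALYZE appends to each plan line. The sketch below assumes a `metrics=[key=value, ...]` suffix and uses an invented two-line sample; the metric names present in real output depend on the operator and the DataFusion version, and operator_metrics is a hypothetical helper.

```python
import re

# Invented sample in the style of EXPLAIN ANALYZE output (values are made up).
ANALYZED_PLAN = """\
FilterExec: tenant_id@2 = 42, metrics=[output_rows=120, elapsed_compute=1.2ms]
  DataSourceExec: file_type=parquet, predicate=tenant_id@2 = 42, metrics=[output_rows=50000, bytes_scanned=1048576]
"""

NAME_RE = re.compile(r"^\s*(\w+Exec)\b")
METRICS_RE = re.compile(r"metrics=\[(.*?)\]")


def operator_metrics(plan_text):
    """Return a list of (operator_name, metrics_dict) pairs in plan order."""
    results = []
    for line in plan_text.splitlines():
        name = NAME_RE.match(line)
        metrics = METRICS_RE.search(line)
        if not name or not metrics:
            continue
        parsed = {}
        # Assumes metric values contain no commas.
        for pair in metrics.group(1).split(","):
            if "=" in pair:
                key, value = pair.split("=", 1)
                parsed[key.strip()] = value.strip()
        results.append((name.group(1), parsed))
    return results


if __name__ == "__main__":
    by_name = dict(operator_metrics(ANALYZED_PLAN))
    kept = int(by_name["FilterExec"]["output_rows"])
    scanned = int(by_name["DataSourceExec"]["output_rows"])
    # A low ratio means most scanned rows were discarded after the scan.
    print(f"rows kept by filter: {kept} of {scanned} ({kept / scanned:.2%})")
```

A ratio like this gives an on-call engineer a concrete number to discuss: how much of the scanned data the query actually used.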

Core idea: physical plans make governed query behavior reviewable instead of assumed.

Evidence belongs in review

Open Data Infrastructure (ODI) should attach plan evidence to the data product contract. A query service can retain representative plans for approved queries, compare plans for drift after schema or partition changes, and flag scans that violate workload budgets.
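A minimal sketch of drift detection and a scan budget check, under stated assumptions: plans are available as indented text, the "shape" of a plan is taken to be its operator names and nesting only, and the budget is a simple cap on listed Parquet files. The function names and sample plans are hypothetical and do not come from DataFusion.

```python
import hashlib
import re

APPROVED = """\
FilterExec: tenant_id@2 = 42
  DataSourceExec: file_groups={1 group: [[data/p=1/a.parquet]]}, file_type=parquet
"""

# Same operators and nesting, different literal and file: not treated as drift.
SAME_SHAPE = """\
FilterExec: tenant_id@2 = 7
  DataSourceExec: file_groups={1 group: [[data/p=9/z.parquet]]}, file_type=parquet
"""

# The filter is gone and the scan lists more files: treated as drift.
DRIFTED = """\
DataSourceExec: file_groups={1 group: [[data/p=1/a.parquet, data/p=2/b.parquet, data/p=3/c.parquet]]}, file_type=parquet
"""


def plan_shape(plan_text):
    """Reduce a plan to 'indent:OperatorName' entries, ignoring literals and paths."""
    shape = []
    for line in plan_text.splitlines():
        match = re.match(r"^(\s*)(\w+Exec)\b", line)
        if match:
            shape.append(f"{len(match.group(1))}:{match.group(2)}")
    return shape


def fingerprint(plan_text):
    """Stable hash of the plan shape, suitable for storing next to an approved query."""
    return hashlib.sha256("\n".join(plan_shape(plan_text)).encode("utf-8")).hexdigest()


def has_drifted(approved_plan, current_plan):
    return fingerprint(approved_plan) != fingerprint(current_plan)


def files_scanned(plan_text):
    """Count Parquet file paths listed anywhere in the plan text."""
    return len(re.findall(r"\.parquet\b", plan_text))


def over_budget(plan_text, max_files):
    return files_scanned(plan_text) > max_files


if __name__ == "__main__":
    print("same shape drifted:", has_drifted(APPROVED, SAME_SHAPE))
    print("filter removed drifted:", has_drifted(APPROVED, DRIFTED))
    print("over budget of 2 files:", over_budget(DRIFTED, max_files=2))
```

Hashing only the shape is a deliberate choice here: parameter changes do not trigger review, while a missing filter or a new operator does. A stricter service could include predicates or projections in the fingerprint.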

For adjacent context, read DataFusion policy-aware query services, DataFusion logical plans as policy evidence, and query engines in ODI.

What breaks first

  • The service logs SQL text but not the execution plan that actually ran.
  • Policy filters exist in application code but disappear before physical execution.
  • UDF behavior changes without plan review or owner approval.
  • Cost incidents are investigated through anecdotes instead of scan and operator evidence.

Questions to ask

Ask which plans are retained, which plan changes require review, and how the service compares scan behavior against data product promises. Ask whether the on-call engineer can see the physical plan when latency spikes.

Sources to start with

These primary sources anchor the technical claims in this guide.

A governed query service should be able to show its work.