Open Data Infrastructure
Compute Cost Portability: Why You Should Be Able to Switch Engines
Decoupling data from compute pricing is one of the biggest cost control moves open infrastructure enables.
If you cannot switch query engines without rewriting your data contract, you do not have a data platform. You have a long-term pricing commitment.
The pricing trap
Most data platforms are sold as “pay for what you use.” The trap is that “what you use” becomes tightly coupled to one engine, one catalog, and one vendor-defined set of performance knobs. Over time, the architecture adapts to the billing model.
When a new workload arrives, the unit economics change. AI workloads are the clearest example, but the same shift happens with interactive analytics, streaming, and high-concurrency operational use cases. If the only viable path is “use more of the same engine,” your cost strategy is not a strategy.
What compute portability means
Compute portability is the ability to run multiple engines against the same governed data contracts without breaking semantics, permissions, or operational expectations.
It is not “any engine can run any query.” It is “we can choose the right engine for a workload, and the governance and metadata still hold.”
Core idea: portability is not about convenience. It is about negotiation power and cost control.
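To make the definition concrete, here is a minimal sketch of engine selection against a fixed contract. It is an illustration, not a reference design: the `TableContract` and `Engine` classes, the engine names, the policy name, and the hourly costs are all hypothetical. The point is that the contract stays constant while the engine varies, and an engine that cannot honor the contract is excluded no matter how cheap it is.

```python
from dataclasses import dataclass, field


@dataclass
class TableContract:
    """Engine-independent description of a governed table (hypothetical shape)."""
    name: str
    table_format: str
    required_policies: frozenset = frozenset()


@dataclass
class Engine:
    """What one engine supports and what it costs per workload kind (made-up numbers)."""
    name: str
    supported_formats: frozenset
    enforced_policies: frozenset
    hourly_cost: dict = field(default_factory=dict)


def can_serve(engine, contract):
    # The engine must read the table format AND enforce every required policy.
    return (contract.table_format in engine.supported_formats
            and contract.required_policies <= engine.enforced_policies)


def pick_engine(engines, contract, workload):
    # Cheapest engine that honors the contract and offers this workload kind.
    candidates = [e for e in engines
                  if can_serve(e, contract) and workload in e.hourly_cost]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.hourly_cost[workload])


orders = TableContract("sales.orders", "iceberg", frozenset({"row_filter_region"}))
engines = [
    Engine("warehouse", frozenset({"iceberg"}), frozenset({"row_filter_region"}),
           {"bi": 4.0, "batch": 9.0}),
    Engine("batch_engine", frozenset({"iceberg"}), frozenset({"row_filter_region"}),
           {"batch": 3.0}),
    # Cheapest on paper, but it does not enforce the row filter, so it never qualifies.
    Engine("cheap_engine", frozenset({"iceberg"}), frozenset(),
           {"bi": 1.0, "batch": 1.0}),
]

for workload in ("bi", "batch"):
    chosen = pick_engine(engines, orders, workload)
    print(workload, "->", chosen.name if chosen else "no eligible engine")
```

In this sketch the choice of engine is a lookup, not a migration, because nothing about `orders` changes when the engine does.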
Requirements for real portability
Portability is won or lost in the contract layer.
- Open table formats: table metadata lives with the data, not inside a proprietary engine boundary.
- Open catalog boundaries: discovery, access, and table operations are available through a stable interface.
- Consistent governance: permissions and policy enforcement must apply across engines, not only within one UI.
- Portable semantics: schema evolution, partitioning, and time travel need clear rules that engines implement consistently.
- Operational ownership: compaction, retention, and maintenance tasks have explicit ownership outside any one engine.
If any of these are engine-specific, portability becomes a migration project rather than a configuration choice.
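The five requirements above can be written down as a checklist. The sketch below assumes a platform is described by a plain dictionary of booleans; the key names and the `platform` example are hypothetical labels for the bullets, not terms from any specification.

```python
# One key per requirement in the list above (names are illustrative).
REQUIREMENTS = (
    "open_table_format",
    "open_catalog_boundary",
    "consistent_governance",
    "portable_semantics",
    "operational_ownership",
)


def engine_specific(platform):
    """Return the requirements that are missing or tied to a single engine."""
    return [r for r in REQUIREMENTS if not platform.get(r, False)]


def adding_an_engine_is(platform):
    # Any engine-specific requirement turns a configuration choice into a migration.
    return "migration project" if engine_specific(platform) else "configuration choice"


platform = {
    "open_table_format": True,
    "open_catalog_boundary": True,
    "consistent_governance": False,  # policies enforced only inside one engine
    "portable_semantics": True,
    "operational_ownership": True,
}

print(engine_specific(platform))
print(adding_an_engine_is(platform))
```

A real assessment needs evidence behind each boolean, but even this crude form makes the gaps explicit and reviewable.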
Tradeoffs and failure modes
Compute portability is not free. It is an architecture choice that pushes complexity into the contract layer.
- Semantic drift: the same SQL does not always mean the same thing across engines.
- Policy gaps: one engine may enforce row-level policy, another may not.
- Operational split brain: table maintenance done by multiple systems can conflict without clear ownership.
- Hidden engine coupling: “works with Iceberg” can still mean “works only through our preferred catalog path.”
Those risks are manageable when the organization treats contracts as first-class infrastructure.
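Operational split brain is the easiest of these risks to check mechanically. The sketch below assumes maintenance assignments are recorded as `(table, task, system)` tuples; the table names, system names, and the `ownership_problems` helper are hypothetical. It flags any required task that has more than one owner or none at all.

```python
from collections import defaultdict


def ownership_problems(assignments, required_tasks):
    """Map (table, task) to a problem description when ownership is unclear.

    assignments: iterable of (table, task, system) tuples.
    required_tasks: tasks every table is expected to have exactly one owner for.
    """
    owners = defaultdict(set)
    tables = set()
    for table, task, system in assignments:
        owners[(table, task)].add(system)
        tables.add(table)

    problems = {}
    for table in sorted(tables):
        for task in required_tasks:
            systems = owners.get((table, task), set())
            if not systems:
                problems[(table, task)] = "no owner"
            elif len(systems) > 1:
                problems[(table, task)] = "conflict: " + ", ".join(sorted(systems))
    return problems


assignments = [
    ("sales.orders", "compaction", "engine_a"),
    ("sales.orders", "compaction", "engine_b"),  # two systems compacting one table
    ("sales.orders", "retention", "scheduler"),
    ("web.events", "compaction", "scheduler"),   # retention has no owner here
]

for key, problem in ownership_problems(assignments, ("compaction", "retention")).items():
    print(key, problem)
```

Running a check like this in review or CI is one way to treat maintenance ownership as part of the contract rather than as tribal knowledge.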
The portability test
Ask this question:
If we add a second engine next quarter, what still works without rewriting?
If the answer is “the files, but not the table history,” portability is partial. If the answer is “the tables, but not the policies,” governance is not portable. If the answer is “nothing,” you are buying a platform that can only be escaped by re-platforming.
Compute portability is not about using more engines. It is about being able to choose.
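The portability test can also be written as a small function, which is handy when comparing several platforms side by side. This is a sketch under assumptions: the labels `files`, `table_history`, `tables`, and `policies` are hypothetical names for what survives a second engine, and an empty result only means none of the three warning signs above fired, not that portability is proven.

```python
def portability_findings(survives):
    """Return the warning signs triggered by what still works on a second engine.

    survives: set of labels for what works without rewriting (illustrative names).
    """
    survives = set(survives)
    if not survives:
        return ["re-platforming is the only exit"]

    findings = []
    if "files" in survives and "table_history" not in survives:
        findings.append("portability is partial")
    if "tables" in survives and "policies" not in survives:
        findings.append("governance is not portable")
    return findings


print(portability_findings({"files"}))
print(portability_findings({"files", "table_history", "tables"}))
print(portability_findings(set()))
```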
Sources to start with
Start with the specs that make table and catalog behavior portable.