Open Lakehouse Benchmark Design for ODI
How ODI benchmarks should test workload fit, interoperability, governance, metadata behavior, and exit paths.
A benchmark that only measures query speed is a benchmark for buying faster tunnel vision.
The practical problem
Open lakehouse decisions often lean on benchmark numbers. Query latency matters. Cost matters. Concurrency matters. But an ODI benchmark that stops there misses the point of open infrastructure.
Open Data Infrastructure is about control, portability, governance, and ecosystem fit. A benchmark should test those properties directly.
Benchmark the workload and the boundary
A useful benchmark starts with real workload classes: ingestion, streaming writes, ad hoc analytics, dashboard serving, model feature retrieval, data product APIs, maintenance, and recovery. Each class should name the engines involved, the catalog path they use, the table format behavior the workload depends on (for example, schema evolution or snapshot reads), the policy requirements, and the freshness expectation.
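One way to make those requirements concrete is to declare each workload class as data before any query runs. The sketch below is a minimal Python illustration, not a prescribed format: `WorkloadSpec`, its field names, and the example values (engine names, catalog path) are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class WorkloadSpec:
    """Hypothetical declaration of one benchmark workload class."""
    name: str                    # e.g. "dashboard serving"
    engines: tuple               # engines that read or write in this class
    catalog_path: str            # how the engines reach the table
    table_format_behavior: str   # format feature the workload depends on
    policy_requirements: tuple   # controls that must be active during the run
    freshness_seconds: int       # maximum acceptable data staleness

    def missing_fields(self):
        """Return names of fields left empty (or non-positive for freshness)."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if f.name == "freshness_seconds":
                if value <= 0:
                    missing.append(f.name)
            elif not value:
                missing.append(f.name)
        return missing


# Illustrative values only; none of these names refer to a real deployment.
spec = WorkloadSpec(
    name="dashboard serving",
    engines=("engine_a", "engine_b"),
    catalog_path="rest-catalog/analytics/orders",
    table_format_behavior="reads latest snapshot",
    policy_requirements=(),      # left empty on purpose
    freshness_seconds=300,
)
print(spec.missing_fields())
```

A spec that reports missing fields is a signal that the workload class is not yet defined well enough to benchmark.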
Then test interoperability. Can another engine read the table without migration? Does schema evolution behave the same way in every engine that touches the table? Do snapshots, manifests, and metadata tables provide enough evidence to operate and troubleshoot the table? Do lineage and policy still apply when the workload moves to a different engine?
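The schema-consistency question can be turned into a mechanical check. The following is a minimal sketch, assuming each engine's view of the table has already been collected and normalized into a plain mapping of column name to type string; how that collection happens is engine-specific and out of scope here. The function name `schema_drift` and the sample schemas are hypothetical.

```python
def schema_drift(writer_schema, reader_schema):
    """Compare the schema one engine wrote with what a second engine reads.

    Both arguments are assumed to be dicts of column name -> type string,
    already normalized to a shared type vocabulary.
    """
    findings = []
    for column, column_type in writer_schema.items():
        if column not in reader_schema:
            findings.append(f"missing in reader: {column}")
        elif reader_schema[column] != column_type:
            findings.append(
                f"type mismatch on {column}: "
                f"{column_type} vs {reader_schema[column]}"
            )
    for column in reader_schema:
        if column not in writer_schema:
            findings.append(f"unexpected in reader: {column}")
    return findings


# Illustrative schemas, as if reported by two different engines.
writer = {"id": "long", "ts": "timestamp", "amount": "decimal(10,2)"}
reader = {"id": "long", "ts": "string"}

for finding in schema_drift(writer, reader):
    print(finding)
```

Running the same comparison after each schema change in the benchmark gives a record of whether evolution behaved consistently, rather than a one-time spot check.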
Core idea: an ODI benchmark should measure whether the architecture preserves control under change.
Evidence beats leaderboard thinking
The benchmark output should include more than a chart. It should include table layout, catalog configuration, query plans, policy decisions, lineage events, maintenance results, and exit steps. That is how teams compare architectures instead of isolated engines.
This matters for systems such as DuckDB, DataFusion, StarRocks, Doris, Flink, and Iceberg because each tool can be excellent in the right layer. The benchmark should show the layer, not crown a universal winner.
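The evidence list above can be enforced as a completeness check on each benchmark run. This is a rough sketch under stated assumptions: the key names in `REQUIRED_EVIDENCE` simply mirror the items named in this section, and the `bundle` contents are invented for illustration.

```python
# Keys mirror the evidence items listed in this section; names are hypothetical.
REQUIRED_EVIDENCE = (
    "table_layout",
    "catalog_configuration",
    "query_plans",
    "policy_decisions",
    "lineage_events",
    "maintenance_results",
    "exit_steps",
)


def missing_evidence(bundle):
    """Return required evidence keys that are absent or empty in the bundle."""
    return [key for key in REQUIRED_EVIDENCE if not bundle.get(key)]


# Illustrative run output: policy decisions are empty, exit steps are absent.
bundle = {
    "table_layout": "partitioned by day(order_ts)",
    "catalog_configuration": {"type": "rest"},
    "query_plans": ["plan_q1.txt"],
    "policy_decisions": [],
    "lineage_events": ["run-001.json"],
    "maintenance_results": {"compaction": "ok"},
}
print(missing_evidence(bundle))
```

Treating an incomplete bundle as a failed run keeps a fast chart from being accepted without the evidence that explains it.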
What breaks first
- The benchmark runs one engine against one happy-path table and calls the result architectural truth.
- Governance is tested manually after performance results are already accepted.
- Metadata behavior, maintenance, and recovery are excluded because they are harder to measure.
- Exit paths are discussed in procurement but never benchmarked.
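These failure modes can be caught before a benchmark runs by linting the plan itself. The sketch below assumes the plan is a plain dictionary; the keys (`engines`, `governance_during_run`, `workloads`, `exit_path_steps`) and the messages are hypothetical, and the check covers engine count only, not whether the tables go beyond the happy path.

```python
def lint_plan(plan):
    """Flag the benchmark-plan gaps described in the list above."""
    findings = []
    # One engine cannot demonstrate interoperability.
    if len(set(plan.get("engines", []))) < 2:
        findings.append("single engine: interoperability is untested")
    # Governance must run with the workload, not after results are accepted.
    if not plan.get("governance_during_run", False):
        findings.append("governance is not exercised during the run")
    # The harder-to-measure workloads must be present.
    covered = set(plan.get("workloads", []))
    for required in ("metadata", "maintenance", "recovery"):
        if required not in covered:
            findings.append(f"{required} workload is missing")
    # An exit path counts only if it has concrete, executed steps.
    if not plan.get("exit_path_steps"):
        findings.append("exit path is not benchmarked")
    return findings


# Illustrative plan with several of the gaps present.
plan = {
    "engines": ["engine_a"],
    "governance_during_run": True,
    "workloads": ["ad hoc analytics", "maintenance"],
    "exit_path_steps": [],
}
for finding in lint_plan(plan):
    print(finding)
```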
Questions to ask
- Which workload classes does the benchmark represent?
- Which interoperability claims are tested with a second engine?
- Which governance controls run during the benchmark?
- Can the platform prove an exit path with preserved data, metadata, and policy?
For adjacent architecture, read "DuckDB, DataFusion, StarRocks, and Doris in ODI", "Table Formats, Catalogs, and Query Engines", and "Trino vs Spark vs DuckDB".
Sources to start with
These primary sources anchor the technical claims in this guide.
- Apache Iceberg table specification
- Apache DataFusion documentation
- StarRocks Iceberg catalog
- Apache Doris Multi Catalog
The benchmark should prove the architecture, not just flatter the engine.