A lakehouse is what happened when teams got tired of choosing between cheap, open files and managed table behavior. Fair enough.

A lakehouse combines lake storage with table behavior

The clean definition is this: a lakehouse stores data in open or semi-open cloud/object storage and adds table-level behavior so analytics and data applications can use it reliably. That behavior includes schemas, snapshots, transactions, governance hooks, and engine access.

The point is not that every lakehouse is automatically open. The point is that the lakehouse pattern separates storage from compute more than the classic warehouse pattern did, then adds enough metadata to make the lake usable for serious workloads.

The word got stretched by marketing

Like every useful architecture term, lakehouse got turned into a product label. Some people use it to mean Iceberg tables on object storage. Some mean a vendor platform. Some mean "warehouse plus cheaper storage." That fuzziness is why the ODI lens helps.

Ask what is open, what is portable, what is governed, and what another engine can safely use. The answer matters more than the label.

Core idea: lakehouse is an architecture pattern. ODI is the control model that decides whether the pattern stays open.

The lake and warehouse both left gaps

Traditional data lakes gave teams storage control but weak guarantees. Warehouses gave teams strong query behavior but usually inside one platform boundary. Lakehouse architecture tries to combine the useful parts: data in open storage, table semantics, engine choice, and governance.

That is the promise. The implementation still decides whether the promise is real.

Questions that cut through the noise

  • Who owns the physical data?
  • Which table format carries metadata and history?
  • Which catalog coordinates table operations and permissions?
  • Which engines can participate without copying data?

If those answers are closed, the lakehouse may still be useful. It just is not open infrastructure yet.

Sources to start with

These are the primary sources I would start from when checking the claims in this piece.