If you say "we are open because we store Parquet," you are describing storage. You are not describing ownership.

What a file format defines

A file format defines how bytes are laid out in a single file. Parquet, ORC, and Avro are file formats. They specify encoding, compression, and the embedded schema and metadata a reader needs to interpret that file.

A file format does not define table-level behavior such as snapshots, schema evolution rules, partition evolution, deletes, or time travel.

What a table format defines

A table format defines the behavior of a table across a collection of files. It defines metadata structures and rules for how writers commit changes and how readers interpret table state.

Iceberg is a table format. It defines snapshots, manifests, schema evolution, partition evolution, time travel, and row-level deletes in a way that can be implemented by multiple engines.
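
To make "behavior across a collection of files" concrete, here is a minimal sketch of a snapshot log. It is a toy model and not Iceberg's actual metadata layout: the names ToyTable, Snapshot, commit, and files_at are hypothetical and chosen only for illustration. It shows the core idea that a table is a sequence of immutable snapshots, each listing the files that belong to it, so a reader can ask for the latest state or an earlier one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of the table: an id plus the data files it contains."""
    snapshot_id: int
    data_files: tuple


@dataclass
class ToyTable:
    """Hypothetical table metadata: an ordered log of snapshots."""
    snapshots: list = field(default_factory=list)

    def commit(self, added=(), removed=()):
        """Append a new snapshot derived from the current one.

        Earlier snapshots are never modified, which is what makes
        reading a past version possible.
        """
        current = self.snapshots[-1].data_files if self.snapshots else ()
        files = tuple(f for f in current if f not in removed) + tuple(added)
        snap = Snapshot(len(self.snapshots) + 1, files)
        self.snapshots.append(snap)
        return snap

    def files_at(self, snapshot_id=None):
        """Return the files to scan for the latest or a given past snapshot."""
        if not self.snapshots:
            return ()
        if snapshot_id is None:
            return self.snapshots[-1].data_files
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.data_files
        raise KeyError(f"unknown snapshot {snapshot_id}")


table = ToyTable()
table.commit(added=["a.parquet", "b.parquet"])
table.commit(added=["c.parquet"], removed=["a.parquet"])

latest_files = table.files_at()              # current table state
first_files = table.files_at(snapshot_id=1)  # "time travel" to snapshot 1
```

The Parquet files themselves never change in this sketch. Everything that makes the collection a table (which files count, in which version) lives in the metadata, which is the part a file format does not specify.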

Core idea: file formats make data readable. Table formats make data governable and portable.

Why the distinction matters for ODI

ODI is about portability with meaning. That requires a table contract, not only readable files. If you store Parquet but rely on a vendor-specific metastore and a vendor-specific commit protocol, you still do not own the table behavior.

Open table formats are the contract layer that lets multiple engines participate without losing semantics.
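
The commit protocol is the part of the contract that is easiest to overlook. The sketch below assumes an optimistic scheme in which a catalog holds one pointer per table to the current metadata, and a commit succeeds only if that pointer has not moved since the writer read it. ToyCatalog, compare_and_swap, the table name "orders", and the metadata paths are all hypothetical; a real catalog would back the swap with a database transaction or an atomic storage operation rather than an in-process lock.

```python
import threading


class ToyCatalog:
    """Hypothetical catalog mapping a table name to its current metadata location."""

    def __init__(self):
        self._pointers = {}
        self._lock = threading.Lock()

    def current(self, table):
        """Return the metadata location a reader or writer should start from."""
        return self._pointers.get(table)

    def compare_and_swap(self, table, expected, new):
        """Move the pointer to `new` only if it still equals `expected`."""
        with self._lock:
            if self._pointers.get(table) != expected:
                return False  # another writer committed first; caller must retry
            self._pointers[table] = new
            return True


catalog = ToyCatalog()
catalog.compare_and_swap("orders", None, "metadata/v1.json")

# Two writers, possibly from different engines, both start from v1.
base_a = catalog.current("orders")
base_b = catalog.current("orders")

writer_a_ok = catalog.compare_and_swap("orders", base_a, "metadata/v2-a.json")
writer_b_ok = catalog.compare_and_swap("orders", base_b, "metadata/v2-b.json")
```

Here the first writer wins and the second is rejected, so it has to re-read the table state and try again. If this rule is open and documented, any engine can write safely. If it is private to one vendor, other engines can read the files but cannot commit to the table without risking its consistency.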

Common mistakes

These mistakes show up everywhere:

  • Assuming "file open" means "platform open": it does not.
  • Ignoring metadata portability: the catalog and governance layers are where lock-in often lives.
  • Skipping time travel and audit: a table contract is not real if you cannot inspect and roll back history.

If you want to own the data product, you need open behavior, not only open bytes.

Sources to start with

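Start with the specifications themselves: the Apache Parquet, ORC, and Avro format specifications for files, and the Apache Iceberg table specification for tables.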