An agent sandbox without a table-state receipt is just a fancy way to forget what the agent actually saw.

Agent sandboxes need receipts

Agent experiments usually start with a reasonable promise: let the agent try a query, test a hypothesis, or propose a transformation away from production. That promise fails when the sandbox reads a moving table and nobody records which table state the agent actually read.

Apache Iceberg gives teams a better primitive. The table specification defines a snapshot as the state of a table at a point in time, identified by a snapshot ID, and branches and tags are named references that point at snapshots. That makes the sandbox boundary concrete: the agent can read a known state instead of whatever production happens to look like at execution time.

Snapshot references give names to evidence

A branch is useful when the agent needs an isolated workspace for proposed changes. A tag is useful when the agent needs a stable reference for evaluation, replay, or incident review. Neither feature is magic governance, but both make governance possible because the table state has a durable identifier.
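As a minimal sketch of the distinction, the helpers below only build statement text; they do not connect to a catalog. They assume a Spark session with the Iceberg SQL extensions enabled, where CREATE TAG, CREATE BRANCH, and VERSION AS OF are available. The function names, the table name, and the reference names are hypothetical.

```python
import re

# Hypothetical helpers: they return SQL text only and never contact a catalog.
# Reference names are limited to a conservative character set so they can be
# placed inside backticks or quotes without escaping.
_REF_NAME = re.compile(r"[A-Za-z0-9_\-]+")


def _check_ref(name: str) -> str:
    if not _REF_NAME.fullmatch(name):
        raise ValueError(f"unsupported reference name: {name!r}")
    return name


def create_tag_sql(table: str, tag: str, snapshot_id: int, retain_days: int) -> str:
    """Tag: a stable name for one snapshot, kept for evaluation, replay, or review."""
    return (
        f"ALTER TABLE {table} CREATE TAG `{_check_ref(tag)}` "
        f"AS OF VERSION {int(snapshot_id)} RETAIN {int(retain_days)} DAYS"
    )


def create_branch_sql(table: str, branch: str, snapshot_id: int) -> str:
    """Branch: an isolated line of snapshots where the agent can propose changes."""
    return (
        f"ALTER TABLE {table} CREATE BRANCH `{_check_ref(branch)}` "
        f"AS OF VERSION {int(snapshot_id)}"
    )


def pinned_read_sql(table: str, ref: str) -> str:
    """Read through a named reference instead of the current table state."""
    return f"SELECT * FROM {table} VERSION AS OF '{_check_ref(ref)}'"


# Example with made-up names and a made-up snapshot ID.
print(create_tag_sql("prod.db.orders", "eval-2024-06", 123, 30))
print(create_branch_sql("prod.db.orders", "agent-sandbox", 123))
print(pinned_read_sql("prod.db.orders", "eval-2024-06"))
```

Both references start from the same snapshot ID here, so an evaluation against the tag and a proposed change on the branch can be traced to one table state.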

Core idea: the useful control is not that an agent used Iceberg. The useful control is that every read, test, and promotion can point back to an exact snapshot reference.

The ODI pattern

The practical pattern is simple. Create a reviewed reference for the sandbox. Bind the agent tool to that reference. Record the snapshot, branch, request, policy decision, and output. If a human promotes a change later, keep that promotion tied to the same evidence chain.
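The record step can be sketched as a small, immutable receipt. This is an illustration under stated assumptions, not a prescribed schema: the class name, the field names, and the choice of SHA-256 over canonical JSON as the receipt ID are all hypothetical. The point is that a later promotion carries the ID of the sandbox run it came from.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SandboxReceipt:
    """One evidence record per agent action. All field names are illustrative."""

    table: str
    snapshot_id: int          # snapshot the agent read
    ref_name: str             # reviewed branch or tag the tool was bound to
    ref_type: str             # "branch" or "tag"
    request: str              # what the agent asked to do
    policy_decision: str      # e.g. "allow:read-only-sandbox"
    output_sha256: str        # digest of the stored output
    parent_receipt_id: Optional[str] = None  # set on promotion

    def __post_init__(self) -> None:
        if self.ref_type not in ("branch", "tag"):
            raise ValueError("ref_type must be 'branch' or 'tag'")

    def receipt_id(self) -> str:
        # Canonical JSON keeps the ID stable across runs and machines.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def digest_output(output: bytes) -> str:
    return hashlib.sha256(output).hexdigest()


# A sandbox run, followed by a human promotion tied to the same evidence chain.
run = SandboxReceipt(
    table="prod.db.orders",
    snapshot_id=123,
    ref_name="agent-sandbox",
    ref_type="branch",
    request="dedupe orders by order_id",
    policy_decision="allow:sandbox-write",
    output_sha256=digest_output(b"rows_changed=42"),
)
promotion = SandboxReceipt(
    table=run.table,
    snapshot_id=run.snapshot_id,
    ref_name=run.ref_name,
    ref_type=run.ref_type,
    request="promote agent-sandbox to main",
    policy_decision="approved-by:reviewer@example.com",
    output_sha256=run.output_sha256,
    parent_receipt_id=run.receipt_id(),
)
```

Because the receipt ID is derived from the content, any edit to the recorded snapshot, request, or policy decision produces a different ID and breaks the link from the promotion.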

For related patterns, read "Iceberg branches for agent experiments," "Iceberg change audit logs," and "Agentic data write approval queues."

What breaks first

  • The sandbox reads the latest table state without storing the snapshot ID.
  • A branch exists, but policy decisions are logged in a separate tool with no table reference.
  • Evaluation output is stored, but the source snapshot is expired by table maintenance before review, so the output can no longer be replayed against its input.
  • Promotion copies data forward without preserving the sandbox evidence.
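The expiry failure in the list above can be caught before the experiment runs. The check below is a rough sketch: it assumes the reference was created with a retention period in days and treats creation time plus retention as the earliest moment the reference could be dropped. Actual removal depends on when snapshot expiration maintenance runs, so this estimate is conservative. The function name and the margin parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def reference_outlives_review(
    created_at: datetime,
    retain_days: int,
    review_due: datetime,
    margin_days: int = 0,
) -> bool:
    """Return True if the reference is retained until the review date plus a margin."""
    earliest_expiry = created_at + timedelta(days=retain_days)
    return earliest_expiry >= review_due + timedelta(days=margin_days)


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
review = datetime(2024, 1, 15, tzinfo=timezone.utc)

print(reference_outlives_review(created, 30, review))  # retention covers the review
print(reference_outlives_review(created, 7, review))   # reference may be gone first
```

Running this check when the sandbox reference is created, rather than at review time, leaves room to extend retention while the snapshot still exists.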

Sandbox review questions

Ask whether the agent can prove which snapshot it read, which branch it wrote to, which policy allowed the request, and which human approved promotion. If the answer depends on a chat transcript, the sandbox is not governed yet.
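These questions can be turned into a mechanical check over whatever record the sandbox keeps. The sketch below assumes the record is a flat mapping; the field names are hypothetical and would need to match the team's own receipt format.

```python
from typing import Any, List, Mapping

# Field names are illustrative; each maps to one review question from the text.
REVIEW_QUESTIONS = {
    "snapshot_id": "Which snapshot did the agent read?",
    "branch": "Which branch did it write to?",
    "policy_decision": "Which policy allowed the request?",
    "approved_by": "Which human approved promotion?",
}


def unanswered_questions(record: Mapping[str, Any]) -> List[str]:
    """Return the review questions the record cannot answer from its own fields."""
    return [
        question
        for field, question in REVIEW_QUESTIONS.items()
        if record.get(field) in (None, "")
    ]


governed = {
    "snapshot_id": 123,
    "branch": "agent-sandbox",
    "policy_decision": "allow:sandbox-write",
    "approved_by": "reviewer@example.com",
}
transcript_only = {"chat_log": "agent said it used the latest data"}

print(unanswered_questions(governed))         # []
print(unanswered_questions(transcript_only))  # all four questions
```

A record that only holds a chat transcript fails every question, which is the ungoverned case described above.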

Sources to start with

These primary sources anchor the technical claims in this guide.

A sandbox is only useful when the evidence survives the experiment.