Deleting old table state is easy to frame as hygiene. That framing is dangerous when the old state is also recovery evidence.

Cleanup is a governance action

Lakekeeper documents table maintenance, including snapshot expiration based on configurable age and retention policies. It also documents concepts such as soft deletion and recovery behavior. Those are operational features with governance consequences.

Table cleanup decides how long teams can recover, audit, compare, and explain previous data states. That means expiration policy belongs in the data product contract, not only in an administrator setting.

Expiration policy has business meaning

A useful expiration policy names the recovery window, retention requirement, table owner, maintenance cadence, exception path, and audit record. It should explain whether the table supports rollback, regulatory review, incident replay, or only storage control.
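One way to make that list concrete is to write the policy down as a typed record and refuse to accept it while fields are missing. The sketch below is illustrative only: `ExpirationPolicy`, `policy_gaps`, and the purpose labels are hypothetical names taken from the paragraph above, not a Lakekeeper or Iceberg schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Tuple

# Hypothetical purpose labels drawn from the text; not a Lakekeeper enum.
KNOWN_PURPOSES = {"rollback", "regulatory_review", "incident_replay", "storage_control"}


@dataclass(frozen=True)
class ExpirationPolicy:
    table: str
    owner: str                         # team or person accountable for the table
    recovery_window: timedelta         # how far back rollback must still work
    retention_requirement: timedelta   # how long states must stay for audit or regulation
    maintenance_cadence: timedelta     # how often cleanup is expected to run
    exception_path: str                # where a hold or extension is requested
    audit_record: str                  # where expiration evidence is written
    purposes: Tuple[str, ...]          # why old state is kept at all


def policy_gaps(policy: ExpirationPolicy) -> List[str]:
    """Return the names of fields that are missing or not meaningful."""
    gaps = []
    if not policy.owner.strip():
        gaps.append("owner")
    if not policy.exception_path.strip():
        gaps.append("exception_path")
    if not policy.audit_record.strip():
        gaps.append("audit_record")
    if policy.recovery_window <= timedelta(0):
        gaps.append("recovery_window")
    if policy.maintenance_cadence <= timedelta(0):
        gaps.append("maintenance_cadence")
    # At least one purpose is required, and every purpose must be a known label.
    if not policy.purposes or not set(policy.purposes) <= KNOWN_PURPOSES:
        gaps.append("purposes")
    return gaps
```

A policy with an empty owner or an unrecognized purpose comes back with those field names listed, which gives a reviewer something specific to reject before any cleanup job is configured.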

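Iceberg snapshot expiration reduces metadata and storage pressure by dropping old snapshots and the files only they reference, which also ends time travel and rollback to those states. The platform still has to decide which snapshots are disposable and which are evidence.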

Core idea: governed cleanup preserves the reason for retention before it removes the bytes.
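As a minimal sketch of that idea, the planner below sorts snapshots into "disposable" and "retained, with a reason" before anything is deleted. The `Snapshot` type, the `holds` mapping, and `plan_expiration` are hypothetical; the code does not call Lakekeeper or Iceberg and only illustrates the decision, assuming snapshot ids and commit times have already been read from the catalog.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: datetime


def plan_expiration(
    snapshots: Iterable[Snapshot],
    now: datetime,
    recovery_window: timedelta,
    holds: Dict[int, str],
) -> Tuple[List[Snapshot], Dict[int, str]]:
    """Split snapshots into disposable ones and retained ones with a reason.

    `holds` maps a snapshot id to the reason it is evidence (hypothetical input,
    for example an open incident or a regulatory review).
    """
    ordered = sorted(snapshots, key=lambda s: s.committed_at)
    latest_id = ordered[-1].snapshot_id if ordered else None
    cutoff = now - recovery_window

    disposable: List[Snapshot] = []
    retained: Dict[int, str] = {}
    for snap in ordered:
        if snap.snapshot_id == latest_id:
            retained[snap.snapshot_id] = "current snapshot"
        elif snap.snapshot_id in holds:
            retained[snap.snapshot_id] = "hold: " + holds[snap.snapshot_id]
        elif snap.committed_at >= cutoff:
            retained[snap.snapshot_id] = "inside recovery window"
        else:
            disposable.append(snap)
    return disposable, retained
```

The design choice worth noting is that the retained side carries a reason string. A snapshot that survives cleanup without a recorded reason is as hard to explain later as one that was deleted without one.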

Table maintenance needs a record

Open Data Infrastructure should connect Lakekeeper maintenance settings to catalog ownership, data product SLAs, lineage, and incident review. Cleanup jobs should leave enough evidence for a human to answer why a snapshot expired and who approved the policy.
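The evidence a cleanup job leaves behind can be as small as one structured line per expired snapshot. The following is a sketch under stated assumptions: the field names, `expiration_record`, and `to_json_line` are hypothetical and are not a Lakekeeper log format. The point is only that the record ties the snapshot to a policy version and an approver.

```python
import json
from datetime import datetime, timezone


def expiration_record(table, snapshot_id, policy_version, approved_by, reason, expired_at):
    """Build one audit entry for one expired snapshot (hypothetical schema)."""
    if expired_at.tzinfo is None:
        raise ValueError("expired_at must be timezone-aware")
    if not approved_by or not reason:
        raise ValueError("approved_by and reason are required")
    return {
        "table": table,
        "snapshot_id": snapshot_id,
        "policy_version": policy_version,  # which version of the policy applied
        "approved_by": approved_by,        # who approved that policy
        "reason": reason,                  # why this snapshot was disposable
        "expired_at": expired_at.astimezone(timezone.utc).isoformat(),
    }


def to_json_line(record):
    """Serialize with sorted keys so entries diff cleanly in an append-only log."""
    return json.dumps(record, sort_keys=True)
```

Rejecting records with no approver or reason is deliberate: a log entry that cannot answer "who approved this" is the gap the paragraph above warns about.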

For adjacent context, read Lakekeeper open catalog operations, Lakekeeper backup and recovery, and snapshot expiration and retention policy.

What breaks first

  • Expiration policy is set globally, but tables have different recovery requirements.
  • Data product owners discover retention changes only after an incident.
  • Cleanup removes the exact snapshot needed to explain a quality regression.
  • Maintenance logs exist but do not connect to ownership or approval evidence.
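The first failure mode can be checked mechanically before an incident exposes it. The sketch below assumes each table's required recovery window is known and compares it with the expiration age that actually applies, either a per-table override or the global default. `under_protected_tables` and its inputs are hypothetical names, not Lakekeeper settings.

```python
from datetime import timedelta
from typing import Dict, List


def under_protected_tables(
    required_recovery: Dict[str, timedelta],
    global_max_age: timedelta,
    overrides: Dict[str, timedelta],
) -> List[str]:
    """List tables whose effective expiration age is shorter than they require."""
    flagged = []
    for table, required in sorted(required_recovery.items()):
        # A per-table override wins; otherwise the global default applies.
        effective = overrides.get(table, global_max_age)
        if effective < required:
            flagged.append(table)
    return flagged
```

Running a check like this whenever the global setting or a table's contract changes turns a silent retention change into a reviewable list.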

Questions to ask

Ask how long each table must remain recoverable, who approves exceptions, and which evidence survives expiration. Ask whether the platform can prove that cleanup followed the data product contract.

Cleanup is safe when the policy can explain what it destroyed.