Open Data Infrastructure
Lakekeeper Multi-Tenant Iceberg Catalogs
Operational questions for Lakekeeper multi-tenant Iceberg catalogs across warehouses, identity, audit, and recovery.
Multi-tenant catalogs fail in the spaces between tenants, not inside the happy path for one table.
The practical problem
Lakekeeper is an Apache Iceberg REST catalog with governance and authorization features. That makes it interesting for teams that want central catalog operations without handing every workload to a single closed platform.
Multi-tenancy raises the bar. A catalog can serve many teams, projects, warehouses, and namespaces, but the operational model has to prove isolation, recovery, and auditability before the platform becomes shared infrastructure.
Tenant boundaries need names
The first design decision is not technical. It is the tenant boundary. Is a tenant a business unit, product domain, environment, customer, legal entity, or workload class? The catalog structure should make that answer visible.
The Lakekeeper documentation describes warehouses, projects, authentication, authorization, and policy integrations. Those features become useful only when the platform team maps them to the boundaries the organization actually operates.
Core idea: multi-tenant catalog design is an isolation contract before it is a namespace layout.
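One way to make the isolation contract concrete is to write the tenant boundary down as data and check it before any namespace is created. The sketch below is a minimal illustration in plain Python. The `TenantBoundary` type, its fields, and the tenant, project, and warehouse names are all hypothetical and are not Lakekeeper API objects. It flags any warehouse that more than one tenant claims.

```python
from dataclasses import dataclass


# Hypothetical model: illustrative names only, not Lakekeeper API objects.
@dataclass(frozen=True)
class TenantBoundary:
    tenant: str            # who the tenant is, e.g. a business unit or customer
    kind: str              # what "tenant" means in this organization
    project: str           # catalog project assigned to the tenant
    warehouses: frozenset  # warehouses the tenant owns inside that project


def find_shared_warehouses(boundaries):
    """Return {(project, warehouse): [tenants]} for warehouses claimed by more than one tenant."""
    owners = {}
    for b in boundaries:
        for wh in b.warehouses:
            owners.setdefault((b.project, wh), set()).add(b.tenant)
    return {key: sorted(t) for key, t in owners.items() if len(t) > 1}


boundaries = [
    TenantBoundary("payments", "business-unit", "proj-payments", frozenset({"prod", "staging"})),
    TenantBoundary("risk", "business-unit", "proj-risk", frozenset({"prod"})),
    TenantBoundary("fraud", "business-unit", "proj-risk", frozenset({"prod"})),
]

shared = find_shared_warehouses(boundaries)
print(shared)  # {('proj-risk', 'prod'): ['fraud', 'risk']}
```

A non-empty result means two tenants share a storage and permission boundary, which is a decision the platform team should make on purpose rather than inherit.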
Operational controls decide whether sharing works
A multi-tenant catalog needs identity controls, role assignment workflows, audit records, recovery tests, and drift checks. Authorization backends such as OpenFGA or Cedar can help represent permissions, but operations still need procedures for bootstrap, backup, restore, reconciliation, and incident response.
The same applies to engines. A shared query engine that reaches the catalog with broad credentials can undo the tenant model. The Lakekeeper OPA bridge documentation is a reminder that compute enforcement and catalog enforcement are different patterns, and each carries different trust assumptions.
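A simple drift check can surface engine identities that span tenants. The sketch below assumes the platform team can export two mappings from its own tooling: which warehouses each principal can reach, and which tenant owns each warehouse. Both mappings, the function name, and the principal names are hypothetical. Nothing here calls Lakekeeper, OpenFGA, or OPA.

```python
def flattened_credentials(grants, warehouse_tenant):
    """Return {principal: [tenants]} for principals that can reach more than one tenant.

    grants:           {principal: set of warehouse names the principal can access}
    warehouse_tenant: {warehouse name: owning tenant}
    """
    result = {}
    for principal, warehouses in grants.items():
        # Warehouses without a known owner are ignored here; treat them as a separate finding.
        tenants = {warehouse_tenant[w] for w in warehouses if w in warehouse_tenant}
        if len(tenants) > 1:
            result[principal] = sorted(tenants)
    return result


# Hypothetical export of access grants and warehouse ownership.
warehouse_tenant = {"payments-prod": "payments", "risk-prod": "risk"}
grants = {
    "svc-payments-etl": {"payments-prod"},
    "svc-shared-engine": {"payments-prod", "risk-prod"},
}

flattened = flattened_credentials(grants, warehouse_tenant)
print(flattened)  # {'svc-shared-engine': ['payments', 'risk']}
```

A principal in this report is not automatically a defect, but it marks a path where tenant isolation depends on the engine rather than on the catalog.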
What breaks first
- Project and warehouse boundaries reflect deployment history instead of tenant risk.
- Engine credentials flatten tenant permissions into one privileged path.
- Backup and restore drills exercise the catalog database but not authorization state.
- Audit logs show API calls but not tenant impact.
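The last failure mode, audit logs without tenant impact, can be narrowed by joining audit events to the tenant mapping. The sketch below is an illustration under stated assumptions: the event fields (`principal`, `action`, `warehouse`) are a hypothetical shape, not Lakekeeper's audit schema, and the warehouse-to-tenant mapping is assumed to come from the platform team's own records.

```python
from collections import Counter


def tenant_impact(events, warehouse_tenant):
    """Count audit events per (tenant, action); unknown warehouses are reported as UNMAPPED."""
    impact = Counter()
    for event in events:
        tenant = warehouse_tenant.get(event["warehouse"], "UNMAPPED")
        impact[(tenant, event["action"])] += 1
    return impact


# Hypothetical audit events and ownership mapping.
warehouse_tenant = {"payments-prod": "payments", "risk-prod": "risk"}
events = [
    {"principal": "etl-a", "action": "dropTable", "warehouse": "payments-prod"},
    {"principal": "etl-a", "action": "dropTable", "warehouse": "payments-prod"},
    {"principal": "admin", "action": "updateTable", "warehouse": "risk-prod"},
    {"principal": "admin", "action": "dropTable", "warehouse": "scratch"},
]

impact = tenant_impact(events, warehouse_tenant)
for (tenant, action), count in sorted(impact.items()):
    print(tenant, action, count)
```

The `UNMAPPED` bucket is the useful part: any activity against a warehouse with no recorded owner is a boundary the tenant model does not cover.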
Questions to ask
- What is the tenant boundary, and where is it encoded?
- Which identities administer tenants versus read tenant data?
- How are authorization state and catalog state reconciled after restore?
- Can one tenant incident be contained without freezing the catalog?
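The reconciliation question can be rehearsed with a set comparison between the two states after a restore. This is a minimal sketch, assuming both the catalog and the authorization backend can be exported as sets of object identifiers. The identifier format and the function name are hypothetical, and no real export API is implied.

```python
def reconcile(catalog_objects, authz_objects):
    """Compare object identifiers known to the catalog with those known to the authorization store."""
    catalog = set(catalog_objects)
    authz = set(authz_objects)
    return {
        # Grants that point at objects the restored catalog no longer has.
        "orphaned_grants": sorted(authz - catalog),
        # Catalog objects that exist without any authorization record.
        "unprotected_objects": sorted(catalog - authz),
    }


# Hypothetical state after restoring the catalog database from an older backup.
catalog_after_restore = {"payments/ledger/entries", "risk/models/scores"}
authz_state = {"payments/ledger/entries", "payments/ledger/refunds"}

report = reconcile(catalog_after_restore, authz_state)
print(report)
# {'orphaned_grants': ['payments/ledger/refunds'], 'unprotected_objects': ['risk/models/scores']}
```

Either list being non-empty after a restore drill suggests that the two stores were backed up at different points in time or restored independently.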
For adjacent operations, read Lakekeeper and Open Catalog Operations, Lakekeeper Backup and Recovery for Iceberg Catalogs, and Apache Polaris and Lakekeeper Catalog Operations.
Sources to start with
These primary sources anchor the technical claims in this guide.
- Lakekeeper documentation
- Lakekeeper authorization
- Lakekeeper OPA bridge
- Apache Iceberg REST Catalog specification
A shared catalog works only when tenant isolation is boring enough to trust.