Open Data Infrastructure
SQLGlot and the Case for Portable SQL
Why SQL portability matters in ODI, and how SQLGlot helps you parse, analyze, and transpile SQL across engine dialects.
If you build ODI and keep the data portable, you will eventually discover a new kind of lock-in. It is not storage lock-in. It is SQL dialect lock-in.
Why it matters
The more engines you add to a stack, the more your SQL becomes an integration surface. That is true for transformations, tests, dashboards, and ad hoc analysis.
A warehouse-centric platform can pretend dialect does not matter because everything runs in one place. ODI cannot. ODI needs SQL that can travel, or a tooling layer that can translate and validate it.
The ODI angle
SQLGlot is a SQL parser and transpiler. In ODI terms, it can be a translation layer that helps you parse queries, analyze ASTs, extract lineage, and translate some dialect differences.
It does not make portability automatic. Semantics differ across engines. A translated query can be syntactically valid and still wrong.
What SQLGlot gives you is a place to centralize the discipline: define what "portable" means, test it, and keep it in CI instead of in tribal knowledge.
Core idea: in a multi-engine world, SQL is infrastructure. Treat parsing and validation like infrastructure too.
The architecture test
For analytics engineers, the test is whether you can move workloads without rewriting meaning.
- Define which queries must be portable and which are allowed to be engine-specific.
- Pick a dialect subset and enforce it with automated checks.
- Validate translation with result comparisons on representative data sets.
- Use AST-level lineage extraction to attach columns and tables to downstream metrics.
- Keep translation tests in CI, not in a migration spreadsheet.
What breaks first
This breaks when "portability" stays a slogan instead of a tested contract.
- You translate SQL and assume it is equivalent without validating results.
- The first migration reveals thousands of hidden dialect quirks.
- Lineage extraction is an afterthought instead of part of code review.
- Teams adopt translation tools but never define the allowed subset of SQL.
Questions to ask
Use these questions when you evaluate SQLGlot portable SQL for multi-engine portability.
- Which dialects do you need to support in practice?
- Which semantics matter most for your business logic, like NULL handling and date math?
- How will you validate equivalence when you transpile?
- Can you extract table and column lineage programmatically from your SQL?
- Where do you draw the line and accept engine-specific SQL?
If you cannot answer those questions, your portability plan is "we will figure it out later." Later is where migration budgets go to die.
Sources to start with
Start with SQLGlot itself, then look at the engine dialects you depend on.