Open Data Infrastructure
SQLGlot SQL Normalization for Agent Review
How SQLGlot normalization can make agent-generated SQL reviewable across dialects, lineage checks, policy review, and migration tests.
Agent-generated SQL is dangerous when review only checks whether the query runs.
Agent SQL needs reviewable structure
Agents can produce SQL quickly. That speed is useful until the query crosses dialects, relies on unqualified identifiers, reads the wrong table, or bypasses a policy expectation. Human review of raw SQL is fragile because formatting, aliases, and dialect quirks can hide what the query actually does.
SQL normalization helps because it turns review into a structured process. The goal is not to make every query identical. The goal is to expose identifiers, relationships, expressions, and dialect-specific behavior in a form tools and humans can inspect.
SQLGlot gives SQL an AST
SQLGlot describes itself as a no-dependency SQL parser, transpiler, optimizer, and engine. Its project docs and source describe parsing SQL into an abstract syntax tree, translating between dialects, and optimizer steps such as qualifying tables and columns.
For agent review, that means a platform can parse generated SQL, normalize identifiers, compare dialect output, inspect table access, and run lineage checks before execution. That is stronger than linting for style or trusting the model's own explanation of the query.
Core idea: agent SQL review should inspect structure, not just strings.
Normalization supports controls
For related ODI patterns, read SQLGlot expression trees for governance review, SQLGlot parser coverage and migration risk, and data modeling for tool-calling agents.
A review pipeline can reject unknown tables, flag cross-domain joins, require explicit column selection, compare source and target dialects, and attach lineage output to the approval record. The important part is making those checks repeatable.
What breaks first
- The query is formatted nicely, but table identity is still ambiguous.
- Dialect translation succeeds syntactically while semantics change.
- Policy review happens after the query already touched data.
- Lineage checks read raw strings instead of parsed structure.
Review questions
Ask whether every generated query is parsed, normalized, policy-checked, lineage-checked, and stored with the agent request. If the answer is no, execution is outrunning governance.
Sources to start with
These primary sources anchor the technical claims in this guide.
- SQLGlot GitHub repository
- SQLGlot onboarding documentation
- SQLGlot qualify optimizer source
- OpenLineage object model documentation
Reviewable SQL is not slower SQL. It is SQL with enough structure to trust.