Topic Hub

Open Data Infrastructure

A curated starting point for my writing on open standards, lakehouse architecture, AI-ready data systems, interoperability, and data ownership.

Explore ODI

Newsletters

Open Data + AI

Substack

Data + AI + Industry Insights. Deep dives into the latest trends in data engineering, AI systems, and open source technology.

Subscribe on Substack →

Your Daily Data

LinkedIn Newsletter

Your data news and insights to help you succeed as a data professional. Published weekly on LinkedIn. 30,000+ subscribers.

Subscribe on LinkedIn →

Blog Posts

LinkedIn

The Architecture of Agents: Building Open Data Infrastructure Today for the AI Factory of Tomorrow

Mar 19, 2026

AI, Agents, Open Data, Data Infrastructure
Read on LinkedIn →
Substack

The AI Survival Guide - Why Open Data Infrastructure Is No Longer Optional

Mar 18, 2026

AI, Open Data, Data Infrastructure
Read on Substack →
Fivetran

Governing Your Lakehouse with Fivetran Managed Data Lake Service

Mar 16, 2026

Data Lakehouse, Data Governance
Read on Fivetran →
LinkedIn

Did Context Windows Just Kill RAG?

Mar 11, 2026

AI, RAG, Context Windows
Read on LinkedIn →
Substack

Apache Iceberg v4 - What It Means for Your AI Data Stack

Mar 10, 2026

Apache Iceberg, AI, Data Stack
Read on Substack →
LinkedIn

Storing Data History Without Regretting It

Mar 9, 2026

Data History, Data Engineering
Read on LinkedIn →
Substack

Open Data Infrastructure - To Infinity and Beyond!

Feb 28, 2026

Open Data, Data Infrastructure, Apache Iceberg
Read on Substack →
LinkedIn

When Data Is Your Advantage, Vendors Hold It Hostage

Feb 26, 2026

Data Strategy, Vendor Lock-In
Read on LinkedIn →
Substack

Iceberg File Format API Release

Feb 24, 2026

Apache Iceberg, File Formats, API
Read on Substack →
Substack

Apache DataFusion: A Data Engineer’s Guide to the Query Engine Reshaping How We Build Data Systems

Feb 23, 2026

Apache DataFusion, Query Engine, Data Engineering
Read on Substack →
Fivetran

The Impact of Apache Polaris Graduating to Top-Level Apache Project

Feb 19, 2026

Apache Polaris, Open Source
Read on Fivetran →
Medium

ODBC and JDBC: Row-Oriented Relics and the Rise of ADBC

Feb 17, 2026

ODBC, JDBC, ADBC
Read on Medium →
LinkedIn

ICYMI - WTF is ADBC?

Feb 17, 2026

ADBC, Data Connectivity
Read on LinkedIn →
LinkedIn

Principles of Testing Quality in Data Pipelines

Feb 14, 2026

Data Pipelines, Testing
Read on LinkedIn →
Medium

The Four Pillars of AI Systems and the Data They Need

Feb 12, 2026

AI, Data Strategy
Read on Medium →
LinkedIn

The Four Pillars of AI Systems

Feb 12, 2026

AI, Data Strategy
Read on LinkedIn →
LinkedIn

AI and the Rise of the Transactional Lakehouse

Feb 11, 2026

AI, Data Lakehouse
Read on LinkedIn →
LinkedIn

The Interoperability Tax — The Challenge of Building For Everyone

Feb 10, 2026

Interoperability, Data Engineering
Read on LinkedIn →
LinkedIn

Lance vs Vortex — What Data Engineers Need To Know

Jan 28, 2026

Data Formats, Data Engineering
Read on LinkedIn →
Medium

Lance vs Vortex: What Data Engineers Need to Know

Jan 21, 2026

Data Formats, Data Engineering
Read on Medium →
Medium

Is Apache Iceberg Melting?

Jan 20, 2026

Apache Iceberg, Data Lakehouse
Read on Medium →
LinkedIn

Is Apache Iceberg Melting?

Jan 18, 2026

Apache Iceberg, Data Lakehouse
Read on LinkedIn →
LinkedIn

Apache Polaris 1.3.0: EVERYTHING YOU NEED TO KNOW

Jan 14, 2026

Apache Polaris, Open Source
Read on LinkedIn →
LinkedIn

Bringing Vector Search into Lakehouse Catalogs: Lance Meets Apache Polaris

Jan 10, 2026

Vector Search, Apache Polaris
Read on LinkedIn →
LinkedIn

Data Modeling Commandments: 10 Rules Every Data Engineer Must Follow

Jan 7, 2026

Data Modeling, Best Practices
Read on LinkedIn →
LinkedIn

Apache Iceberg — Scaling Petabytes & People

Dec 18, 2025

Apache Iceberg, Scalability
Read on LinkedIn →
LinkedIn

Monthly Merge Report — November 2025

Dec 15, 2025

Open Source, Community
Read on LinkedIn →
Fivetran

Data Pipeline State Management: An Underappreciated Challenge

Dec 10, 2025

Data Pipelines, State Management
Read on Fivetran →
Fivetran

Monthly Merge Report for OSS Projects: November 2025

Nov 28, 2025

Open Source, Community
Read on Fivetran →
LinkedIn

Lakehouse Table Format Comparison (November 2025)

Nov 18, 2025

Data Lakehouse, Table Formats
Read on LinkedIn →
Fivetran

Getting Started with Apache Polaris Catalog in Fivetran's Managed Data Lake Service

Nov 12, 2025

Apache Polaris, Data Lakehouse
Read on Fivetran →
LinkedIn

How Apache Iceberg Delivers ACID Transactions on Data Lakes

Nov 10, 2025

Apache Iceberg, ACID
Read on LinkedIn →
LinkedIn

Monthly Merge Report — October 2025

Nov 5, 2025

Open Source, Community
Read on LinkedIn →
Fivetran

Introducing Fivetran Init: Accelerate Custom Connector Development

Oct 22, 2025

Connectors, Developer Tools
Read on Fivetran →
LinkedIn

Apache Polaris 1.2.0 Release — What You Need To Know!

Oct 18, 2025

Apache Polaris, Open Source
Read on LinkedIn →
LinkedIn

Build a Free Data Lakehouse with Fivetran, S3, Apache Iceberg, and DuckDB

Oct 10, 2025

Data Lakehouse, Tutorial
Read on LinkedIn →
Fivetran

Get Started with Iceberg Without the Brain Freeze

Oct 8, 2025

Apache Iceberg, Getting Started
Read on Fivetran →
Tobiko Data

Monthly Merge Report — September 2025

Sep 30, 2025

Open Source, Community
Read on Tobiko Data →
Tobiko Data

Monthly Merge Report — August 2025

Aug 31, 2025

Open Source, Community
Read on Tobiko Data →
Tobiko Data

Open Source AI with Spice + SQLMesh

Aug 18, 2025

AI, SQLMesh, Open Source
Read on Tobiko Data →
LinkedIn

Data Modeling vs Data Transformation

Aug 18, 2025

Data Modeling, Data Transformation
Read on LinkedIn →
LinkedIn

SQLMesh + SPICE.AI OSS for AI-Ready Data

Aug 12, 2025

SQLMesh, AI
Read on LinkedIn →
Tobiko Data

Shipped: Dev-Only VDE Mode

Aug 5, 2025

SQLMesh, Developer Tools
Read on Tobiko Data →
LinkedIn

Build an Open Lakehouse on Your Laptop with DuckLake + SQLMesh

Aug 5, 2025

DuckLake, SQLMesh, Tutorial
Read on LinkedIn →
Tobiko Data

K-means Clustering with SQLMesh Python Models

Jul 22, 2025

SQLMesh, Machine Learning
Read on Tobiko Data →
Tobiko Data

DuckLake + SQLMesh Tutorial: Build a Modern Data Lakehouse on Your Laptop

Jul 14, 2025

SQLMesh, Data Lakehouse, Tutorial
Read on Tobiko Data →
Tobiko Data

Monthly Merge Report — July 2025

Jul 1, 2025

Open Source, Community
Read on Tobiko Data →
Tobiko Data

SQLMesh FLOW Mode: Onboarding Reimagined

Jun 18, 2025

SQLMesh, Developer Experience
Read on Tobiko Data →
LinkedIn

SQLMesh — The Key to Your AI Strategy

Jun 18, 2025

SQLMesh, AI, Strategy
Read on LinkedIn →
Tobiko Data

SQLMesh Delivers 22x Faster Data Transformation and 10x Cost Savings vs dbt Core on Snowflake

Jun 4, 2025

SQLMesh, Performance, Snowflake
Read on Tobiko Data →
Medium

Hands-On SQLMesh Using the VS Code Extension

May 31, 2025

SQLMesh, VS Code, Tutorial
Read on Medium →
Tobiko Data

SQLMESH PLAN --EXPLAIN: Understand Your Changes Before They Happen

May 20, 2025

SQLMesh, Developer Tools
Read on Tobiko Data →
LinkedIn

DuckLake vs Apache Iceberg: Which One Is Right For You?

May 20, 2025

DuckLake, Apache Iceberg
Read on LinkedIn →
Medium

SQLMesh Sushi — An End-to-End Pipeline Tutorial

May 17, 2025

SQLMesh, Tutorial
Read on Medium →
Medium

Managing Temporal Change in Data Pipelines

May 16, 2025

Data Pipelines, Best Practices
Read on Medium →
LinkedIn

Hands-On SQLMesh Using The VS Code Extension

May 15, 2025

SQLMesh, VS Code, Tutorial
Read on LinkedIn →
Medium

The Evolution of Data Pipelines: From ETL to ELT and Beyond!

May 13, 2025

ETL, ELT, Data Pipelines
Read on Medium →
Medium

End-to-End Tutorial: Integrating SQLMesh with DLT

May 12, 2025

SQLMesh, dlt, Tutorial
Read on Medium →
LinkedIn

SQLMesh Sushi Hands-On Tutorial — From Setup to Running Your First Pipeline

May 10, 2025

SQLMesh, Tutorial
Read on LinkedIn →
Tobiko Data

Is dbt Fusion the Death of dbt Core?

May 6, 2025

dbt, Data Transformation
Read on Tobiko Data →
Medium

Versioning Data Transformations with SQLMesh: Trust, Speed, and Control

Apr 25, 2025

SQLMesh, Data Transformation
Read on Medium →
Tobiko Data

A Perspective on Speed and State

Apr 22, 2025

Data Engineering, Architecture
Read on Tobiko Data →
Medium

End-to-End Data Engineering Project with DuckDB + SQLMesh

Apr 19, 2025

DuckDB, SQLMesh, Tutorial
Read on Medium →
Medium

Data Modeling: A Guide for Data Analysts

Apr 19, 2025

Data Modeling, Analytics
Read on Medium →
Medium

Idempotent Data Pipelines with Spark and SQLMesh

Apr 19, 2025

Spark, SQLMesh, Idempotency
Read on Medium →
Medium

Backfilling Data Pipelines: Concepts, Examples, and Best Practices

Apr 19, 2025

Data Pipelines, Best Practices
Read on Medium →
Medium

SQLMesh Incremental Modeling with DuckDB: A Hands-On Tutorial

Apr 19, 2025

SQLMesh, DuckDB, Tutorial
Read on Medium →
Tobiko Data

Expanding Apache Airflow with Tobiko Cloud

Apr 8, 2025

Apache Airflow, Orchestration
Read on Tobiko Data →
Tobiko Data

SQLMesh Model Blueprinting in Practice: A Hospital Network Example

Mar 25, 2025

SQLMesh, Best Practices
Read on Tobiko Data →
Dremio

Key Takeaways from the 2025 State of the Data Lakehouse Report

Mar 10, 2025

Data Lakehouse, Industry Report
Read on Dremio →
Medium

Storytelling with Power — What Data Engineers Can Learn from D&D

Mar 8, 2025

Communication, Data Engineering
Read on Medium →
Medium

“Data Mesh” or “Data Mess”?

Feb 24, 2025

Data Mesh, Architecture
Read on Medium →
Dremio

The Evolution of the Modern Data Team

Feb 24, 2025

Data Teams, Strategy
Read on Dremio →
Medium

Enterprise Data Catalogs and Technical Metadata Catalogs: A Practical Guide

Feb 23, 2025

Data Catalogs, Metadata
Read on Medium →
Medium

The Evolution of Data Storage

Feb 20, 2025

Data Storage, Data Warehousing
Read on Medium →
Medium

A/B Tests for Data Analysts

Feb 18, 2025

A/B Testing, Analytics
Read on Medium →
Medium

2025 AI Insights Report — What You Need to Know

Feb 17, 2025

AI, Industry Report
Read on Medium →
Medium

The Future of Data Careers: Skills You Need to Succeed in 2025

Feb 17, 2025

Careers, Data Engineering
Read on Medium →
Medium

Your Dashboard Sucks — And How to Fix It

Feb 15, 2025

Dashboards, Data Visualization
Read on Medium →
Medium

Your SQL Performance Sucks — And How to Fix It

Feb 13, 2025

SQL, Performance
Read on Medium →
Medium

The Evolution of Data Processing: From ETL to ELT

Feb 12, 2025

ETL, ELT, Data Processing
Read on Medium →
Dremio

Understanding Data Mesh and Data Fabric: A Guide for Data Leaders

Feb 10, 2025

Data Mesh, Data Fabric
Read on Dremio →
Dremio

Simplifying Data Discovery with the Dremio Connector for Alation

Jan 27, 2025

Data Discovery, Partnerships
Read on Dremio →
Dremio

Understanding Dremio's Architecture: A Game-Changing Approach to Data Lakes and Self-Service Analytics

Jan 13, 2025

Architecture, Self-Service Analytics
Read on Dremio →
Dremio

Why Your Data Strategy Needs Data Products

Dec 18, 2024

Data Products, Strategy
Read on Dremio →
Dremio

Dremio and Monte Carlo — Enhanced Data Reliability for Your Data Lakehouse

Dec 4, 2024

Data Reliability, Partnerships
Read on Dremio →
Dremio

A Data Analyst's Guide to JDBC, ODBC, REST, and Arrow Flight

Nov 20, 2024

Connectivity, Arrow Flight
Read on Dremio →
Dremio

Lakehouse Architecture for Unified Analytics — A Data Analyst's Guide

Nov 6, 2024

Data Lakehouse, Analytics
Read on Dremio →
Dremio

Unified Semantic Layer: A Modern Solution for Self-Service Analytics

Oct 23, 2024

Semantic Layer, Self-Service Analytics
Read on Dremio →
Dremio

Dremio vs. Denodo — A Comparison

Oct 9, 2024

Product Comparison, Data Virtualization
Read on Dremio →