Semantic Models

The word “semantic” is heavily used (often incorrectly) when discussing data models. It is an adjective relating to meaning in language or logic. When we think…

DuckLake

There are some great names in software engineering and this is most definitely one of them. The lakehouse architecture promised to combine the best of data warehouses and data lakes.…

Why Iceberg

Iceberg has gained a huge amount of popularity in recent years, but why is this table format now finding such widespread adoption? There are a number of reasons and I…

Consensus Algorithms

Ever tried to arrange anything with friends via group chat? Messages arrive out of order, some people don’t respond, and others change their minds. Now swap friends for computers and…

Cantor & Codd

If you’ve ever queried a database with SQL (Structured Query Language) then you’ve been standing on the mathematical shoulders of Georg Cantor. His 19th-century work on set theory laid the…

The 6 Pillars

Not a Wu-Tang Clan song. This is about Data Quality. Every organization claims to want high-quality data, but when pressed to define what that means, the conversation becomes vague. They…

Compression Codecs

Memory is cheap, but it ain’t free. In the world of modern data engineering, compression is everywhere. It’s in your Parquet files, your Kafka messages, your database storage engine, your…