The Disaggregation of the Lakehouse Stack

How Delta Kernel, Arrow, and pluggable execution are disaggregating the lakehouse stack. The lakehouse stack is not converging on a new dominant engine — it is converging on a layered architecture in which protocol, data representation, and query execution are increasingly isolated behind stable interfaces.

March 8, 2026 · 15 min
Where Data System Abstractions Break: A Semiotic Reading

Many of the most surprising performance pathologies in modern data systems are semiotic failures — structural divergences between what an interface signifies and what the underlying system does.

March 4, 2026 · 13 min
Spark Is Not Just Lazy. Spark Compiles Dataflow.

Why calling Spark ‘lazy’ is technically reductive, and how thinking of it as a dataflow compiler changes the way you design pipelines.

November 3, 2025 · 12 min
Fixing Skewed Nested Joins in Spark with Asymmetric Salting

In large-scale Spark pipelines, skew can occur when a single key carries a disproportionately large nested payload. Asymmetric salting offers a targeted solution: explode, salt, join in parallel, and optionally re-aggregate.

December 1, 2025 · 17 min