Spark Is Not Just Lazy. Spark Compiles Dataflow.

Why calling Spark 'lazy' is technically reductive, and how thinking of it as a dataflow compiler changes the way you design pipelines.

November 3, 2025 · 12 min

Fixing Skewed Nested Joins in Spark with Asymmetric Salting

In large-scale Spark pipelines, skew can occur when a single key carries a disproportionately large nested payload. Asymmetric salting offers a targeted solution: explode, salt, join in parallel, and optionally re-aggregate.

December 1, 2025 · 17 min