Spark Is Not Just Lazy. Spark Compiles Dataflow.

Why calling Spark ‘lazy’ is technically reductive, and how thinking of it as a dataflow compiler changes the way you design pipelines.

November 3, 2025 · 12 min

A Shared Kernel Is a Shared Trust Domain

Containers isolate processes, not trust boundaries. When your platform runs untrusted code, the architectural question is where you place the kernel boundary, and what that costs in memory, latency, and operational complexity.

February 2, 2026 · 18 min

Fixing Skewed Nested Joins in Spark with Asymmetric Salting

In large-scale Spark pipelines, skew can occur when a single key carries a disproportionately large nested payload. Asymmetric salting offers a targeted solution: explode, salt, join in parallel, and optionally re-aggregate.

December 1, 2025 · 17 min

Two Algorithms, One Intuition: Shunting Yard and Pratt Parsing

Parsing arithmetic expressions looks simple… until precedence enters the picture. Two classic algorithms — Dijkstra’s Shunting Yard and Pratt’s Top-Down Operator Precedence — provide radically different answers that reveal the same underlying intuition.

October 6, 2025 · 12 min

Inside GKE Workload Identity: How Kubernetes Identities Become GCP Service Accounts

GKE behind the scenes: how Kubernetes service accounts interact with GCP service accounts through the metadata server.

August 1, 2024 · 9 min