Technical analyses of system internals, runtime behavior, and implementation details. These articles examine how things actually work in practice, especially under scale and operational pressure.
Unpacking Parquet: Explicit SIMD, Scalar Baselines, and What HotSpot Makes of Them
On the JVM, optimizing a hot kernel is not only about writing faster code; it is also about understanding how much the result depends on the machine code HotSpot derives from the scalar loop. Using Parquet bit-unpacking as a concrete case, the piece shows how strongly a measured SIMD speedup depends on which scalar baseline C2 is handed, when explicit vectorization is actually justified, and why a more specialized scalar routine is not necessarily faster.
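To make "scalar baseline" concrete, here is a minimal sketch of two scalar unpackers of the kind such a comparison might start from: a generic routine for any bit width and a routine specialized for 4-bit values. It is an illustration under stated assumptions, not the article's code or the Parquet library's implementation. The class and method names (`ScalarUnpack`, `unpackGeneric`, `unpackWidth4`) are hypothetical, and the input is assumed to be packed LSB-first, as in the bit-packed runs of Parquet's RLE/bit-packing hybrid encoding.

```java
import java.util.Arrays;

// Hypothetical illustration: two scalar baselines for bit-unpacking.
final class ScalarUnpack {

    // Generic baseline: unpacks `count` values of `bitWidth` bits (1..32),
    // assumed to be packed LSB-first in `in`, into `out`.
    static void unpackGeneric(byte[] in, int bitWidth, int count, int[] out) {
        long mask = (1L << bitWidth) - 1;
        long buffer = 0;       // bit accumulator, lowest bits are consumed first
        int bitsInBuffer = 0;  // number of valid bits currently in the accumulator
        int pos = 0;           // next input byte to read
        for (int i = 0; i < count; i++) {
            // Refill one byte at a time until a full value is available.
            while (bitsInBuffer < bitWidth) {
                buffer |= (in[pos++] & 0xFFL) << bitsInBuffer;
                bitsInBuffer += 8;
            }
            out[i] = (int) (buffer & mask);
            buffer >>>= bitWidth;
            bitsInBuffer -= bitWidth;
        }
    }

    // Specialized baseline for bitWidth == 4: each input byte holds exactly
    // two values, low nibble first. Assumes `count` is even.
    static void unpackWidth4(byte[] in, int count, int[] out) {
        for (int j = 0; j < count / 2; j++) {
            int b = in[j] & 0xFF;
            out[2 * j] = b & 0x0F;
            out[2 * j + 1] = b >>> 4;
        }
    }

    public static void main(String[] args) {
        // The values 0..7 at bit width 3, packed LSB-first, occupy three bytes.
        byte[] packed = {(byte) 0x88, (byte) 0xC6, (byte) 0xFA};
        int[] out = new int[8];
        unpackGeneric(packed, 3, 8, out);
        System.out.println(Arrays.toString(out)); // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

Both routines produce the same output for 4-bit input, but they give the JIT compiler different loops to work with. Which one runs faster is an empirical question that depends on the code C2 generates for each, so any SIMD speedup should be reported against a named baseline.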