Unpacking Parquet: Where Explicit SIMD Actually Matters

On the JVM, optimizing a hot kernel is not only about writing faster code; it is also about controlling how much the result depends on the compiler recognizing the code’s shape. Using Parquet bit-unpacking as a concrete case, our experiment shows that the measured SIMD speedup depends on which scalar baseline C2 is handed, identifies when explicit vectorization is actually justified, and explains why a more specialized scalar routine is not necessarily faster.

June 9, 2026 · 22 min