From Dictionary Indices to Integers: Parquet Bit Unpacking Explained
Parquet often avoids storing a column’s values directly. With dictionary encoding, each distinct value is stored once in a dictionary, and the column itself becomes a sequence of small integer indices, bit-packed so that each index takes only as many bits as the dictionary size requires. This explainer walks through dictionary encoding, how the indices are bit-packed, and the bit-unpacking step that expands them back into an int[] so they can be used for dictionary lookups.
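
As a rough illustration of the round trip, here is a minimal Java sketch. It is not Parquet's actual reader or writer code: the class BitPackingSketch and its methods bitWidth, pack, and unpack are hypothetical names made up for this example. It assumes indices are packed back to back, least-significant bit first, and it leaves out the run-length and header handling that a real Parquet data page would have.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the parquet-java implementation.
class BitPackingSketch {

    // Smallest number of bits that can represent indices 0..dictionarySize-1.
    static int bitWidth(int dictionarySize) {
        return dictionarySize <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(dictionarySize - 1);
    }

    // Packs each index into `width` bits, least-significant bit first,
    // with values laid out back to back across byte boundaries.
    static byte[] pack(int[] indices, int width) {
        byte[] out = new byte[(indices.length * width + 7) / 8];
        int bitPos = 0;
        for (int index : indices) {
            for (int b = 0; b < width; b++, bitPos++) {
                if (((index >>> b) & 1) != 0) {
                    out[bitPos >>> 3] |= 1 << (bitPos & 7);
                }
            }
        }
        return out;
    }

    // Reverses pack(): reads `count` values of `width` bits each into an int[].
    static int[] unpack(byte[] packed, int width, int count) {
        int[] out = new int[count];
        int bitPos = 0;
        for (int i = 0; i < count; i++) {
            int value = 0;
            for (int b = 0; b < width; b++, bitPos++) {
                int bit = (packed[bitPos >>> 3] >>> (bitPos & 7)) & 1;
                value |= bit << b;
            }
            out[i] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        // The dictionary holds each distinct value once.
        String[] dictionary = {"red", "green", "blue"};
        // The column is represented as indices into the dictionary.
        int[] indices = {0, 1, 2, 1, 0, 2, 2, 1};

        int width = bitWidth(dictionary.length);
        byte[] packed = pack(indices, width);
        int[] unpacked = unpack(packed, width, indices.length);

        System.out.println("bit width: " + width + ", packed bytes: " + packed.length);
        System.out.println("unpacked indices: " + Arrays.toString(unpacked));

        // Dictionary lookup happens only after the indices are back in an int[].
        for (int index : unpacked) {
            System.out.print(dictionary[index] + " ");
        }
        System.out.println();
    }
}
```

The bit-at-a-time loops are written for readability. A production decoder would typically process many values per step, but the layout idea is the same.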