A Note on Second-Order Expected Maximum-Load Bounds for Binary Linear Hashing

Let $S\subseteq F_2^u$ have size $n=2^\ell$, and let $h:F_2^u\to F_2^\ell$ be a uniformly random linear map. For $y\in F_2^\ell$, write $Load_h(y):=|h^{-1}(y)\cap S|$, and let $M(S,h):=\max_{y\in F_2^\ell} Load_h(y)$ be the maximum load. Jaber, Kumar and Zuckerman (STOC 2025) proved that the expected maximum load of $h$ on $S$ is at most $16\log n/\log\log n$, which matches, up to constants, the scale for fully independent hashing of keys into bins. Their proof also gives the tail estimate \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left(\frac{1}{R^{2}}\right). \] We record a base optimization in their exponential-potential method showing that binary linear hashing nearly matches fully independent hashing at the second-order scale of the maximum load as well. For every $R>1$ satisfying $R\ell^{1-1/R}\ge D\ln\ell$, where $D$ is an absolute constant, we prove \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left( \frac{(\log\log n)^2}{R^2(\log n)^{2-2/R}} \right). \] Integrating this tail yields \[ E[M(S,h)] \le \left( 1+ (1+o(1)) \frac{\log\log\log n}{\log\log n} \right) \frac{\log n}{\log\log n}. \] Thus binary linear hashing matches fully independent hashing in the leading term and matches the dominant second-order correction up to a $1+o(1)$ factor. We also prove, by an independent self-contained argument, a sharp tail bound for a single prescribed bucket: for fixed $y\in F_2^\ell$, \[ \Pr[ Load_h(y)>2^a-2]\le \gamma^{-1}2^{-a^2}, \] where $\gamma=\prod_{j\ge1}(1-2^{-j})$. A subspace construction shows that this bound is asymptotically tight, even in the leading constant, as $a\to\infty$. It controls only one fixed bucket, however: a direct union bound over all $2^\ell$ buckets loses a factor of $2^\ell$.
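To make the objects above concrete, the following is a minimal Python sketch, not taken from the paper, that samples a uniformly random linear map over $F_2$, computes the bucket loads of a set $S$, and evaluates the fixed-bucket bound $\gamma^{-1}2^{-a^2}$ numerically. The function names, the bit-packed representation of vectors as integers, and the truncation of the infinite product to 200 terms are all illustrative assumptions.

```python
import random
from collections import Counter


def random_linear_map(u, ell, rng):
    """Sample the ell rows of a uniformly random ell x u matrix over F_2.

    Each row is packed into a Python int with u bits (an assumed encoding).
    """
    return [rng.getrandbits(u) for _ in range(ell)]


def apply_map(rows, x):
    """Compute h(x): output bit i is the parity of the inner product <rows[i], x>."""
    y = 0
    for i, r in enumerate(rows):
        y |= (bin(r & x).count("1") & 1) << i
    return y


def loads(rows, S):
    """Return a Counter mapping each nonempty bucket y to Load_h(y)."""
    return Counter(apply_map(rows, x) for x in S)


def max_load(rows, S):
    """Return M(S, h), the largest bucket load."""
    return max(loads(rows, S).values())


def gamma(terms=200):
    """Approximate gamma = prod_{j>=1} (1 - 2^{-j}) by a truncated product."""
    g = 1.0
    for j in range(1, terms + 1):
        g *= 1.0 - 2.0 ** (-j)
    return g


def fixed_bucket_bound(a):
    """Evaluate the stated bound gamma^{-1} 2^{-a^2} on Pr[Load_h(y) > 2^a - 2]."""
    return 2.0 ** (-a * a) / gamma()


if __name__ == "__main__":
    rng = random.Random(0)
    u, ell = 16, 8
    S = rng.sample(range(2 ** u), 2 ** ell)  # an arbitrary set of size n = 2^ell
    rows = random_linear_map(u, ell, rng)
    print("max load:", max_load(rows, S))
    print("bound for a = 3:", fixed_bucket_bound(3))
```

When $S$ is itself a subspace, every nonempty bucket is a coset of $\ker h\cap S$, so all nonempty loads are equal and are a power of two. This is the structure that the subspace construction mentioned above relies on.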
