In Linear Hashing ($\mathsf{LH}$) with $\beta$ bins on a size-$u$ universe ${\mathcal{U}=\{0,1,\ldots, u-1\}}$, items $\{x_1,x_2,\ldots, x_n\}\subset \mathcal{U}$ are placed in bins by the hash function $$x_i\mapsto ((ax_i+b) \bmod p) \bmod \beta$$ for some prime $p\in [u,2u]$ and randomly chosen integers $a,b \in [1,p]$. The "maxload" of $\mathsf{LH}$ is the number of items assigned to the fullest bin. The expected maxload for a worst-case set of items is a natural measure of how well $\mathsf{LH}$ distributes items among the bins. Fix $\beta=n$. Despite the simplicity of $\mathsf{LH}$, bounding its worst-case maxload is extremely challenging. It is well known that on random inputs $\mathsf{LH}$ achieves maxload $\Omega\left(\frac{\log n}{\log\log n}\right)$; this is currently the best lower bound for the expected maxload of $\mathsf{LH}$. Recently, Knudsen established an upper bound of $\widetilde{O}(n^{1 / 3})$. The question "Is the worst-case expected maxload of $\mathsf{LH}$ $n^{o(1)}$?" is one of the most basic open problems in discrete math. In this paper we propose a set of intermediate open questions to help researchers make progress on this problem. We establish the relationships between these intermediate open questions and make partial progress on them.
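To make the definitions concrete, the following is a minimal Python sketch of the hash function and the maxload it induces. It is an illustration under stated assumptions and is not taken from the paper: the helper names (`prime_in_range`, `lh_maxload`, `estimate_expected_maxload`) are hypothetical, the prime is simply the smallest one in $[u,2u]$, and the expectation is estimated by sampling $a,b$ rather than computed exactly.

```python
import random
from collections import Counter


def is_prime(m):
    """Trial-division primality test; adequate for small illustrative universes."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def prime_in_range(u):
    """Return the smallest prime p with u <= p <= 2u.

    Assumption: any prime in [u, 2u] is acceptable; one always exists for
    u >= 1 by Bertrand's postulate.
    """
    p = u
    while not is_prime(p):
        p += 1
    return p


def lh_maxload(items, beta, p, a, b):
    """Maxload of the map x -> ((a*x + b) mod p) mod beta on the given items."""
    loads = Counter(((a * x + b) % p) % beta for x in items)
    return max(loads.values())


def estimate_expected_maxload(items, u, beta, trials=200, seed=0):
    """Monte Carlo estimate of the expected maxload over random a, b in [1, p]."""
    rng = random.Random(seed)
    p = prime_in_range(u)
    total = 0
    for _ in range(trials):
        a = rng.randint(1, p)  # randint is inclusive on both ends
        b = rng.randint(1, p)
        total += lh_maxload(items, beta, p, a, b)
    return total / trials


if __name__ == "__main__":
    n = 64
    items = list(range(n))  # one particular input set, not a worst case
    print(estimate_expected_maxload(items, u=1024, beta=n))
```

Sampling gives only an estimate for one fixed item set; the worst-case expected maxload discussed above is a maximum over all item sets, which this sketch does not attempt to search.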