Given an increasing sequence of integers $x_1,\ldots,x_n$ from a universe $\{0,\ldots,u-1\}$, a monotone minimal perfect hash function (MMPHF) for this sequence is a data structure that answers the following rank queries: $rank(x) = i$ if $x = x_i$ for some $i\in \{1,\ldots,n\}$, and $rank(x)$ is arbitrary otherwise. Assadi, Farach-Colton, and Kuszmaul recently presented at SODA'23 a proof of the lower bound $\Omega(n \min\{\log\log\log u, \log n\})$ on the number of bits of space required by an MMPHF, provided $u \ge n 2^{2^{\sqrt{\log\log n}}}$. This bound is tight, since there is a data structure for MMPHF that attains this space bound (and answers queries in $O(\log u)$ time). In this paper, we close the remaining gap by proving that, for $u \ge (1+\epsilon)n$, where $\epsilon > 0$ is any constant, the tight lower bound is $\Omega(n \min\{\log\log\log \frac{u}{n}, \log n\})$, which is also attainable. We further observe that, for all reasonable cases with $n < u < (1+\epsilon)n$, known facts imply tight bounds, which virtually settles the problem. Along the way, we substantially simplify the proof of Assadi et al., replacing a part of their heavy combinatorial machinery with trivial observations. However, an important part of the proof remains complicated. This part of our paper repeats the arguments of Assadi et al. and is not novel; nevertheless, we include it for completeness, offering a somewhat different perspective on these arguments.
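To make the query semantics concrete, the following is a minimal Python sketch of the rank interface defined above. It is only a reference for what an MMPHF must answer, not a construction from this paper or from Assadi et al.: it stores the keys explicitly, so it uses far more space than the bounds discussed here. The class name `NaiveMonotoneRank` and its method names are hypothetical, chosen for illustration.

```python
from bisect import bisect_left


class NaiveMonotoneRank:
    """Illustrative reference for MMPHF query semantics (hypothetical name).

    Stores all keys explicitly, so it does NOT achieve MMPHF space bounds;
    it only shows which answers are required and which are unconstrained.
    """

    def __init__(self, keys):
        keys = list(keys)
        # The definition assumes a strictly increasing sequence x_1 < ... < x_n.
        if any(a >= b for a, b in zip(keys, keys[1:])):
            raise ValueError("keys must be strictly increasing")
        self._keys = keys

    def rank(self, x):
        # For x = x_i this returns i (1-based), as the definition requires.
        # For x outside the sequence the answer is unconstrained; this sketch
        # happens to return the 1-based insertion position, which is one of
        # many permitted "arbitrary" answers.
        return bisect_left(self._keys, x) + 1


mmphf = NaiveMonotoneRank([3, 7, 8, 20])
print([mmphf.rank(x) for x in (3, 7, 8, 20)])
```

A genuine MMPHF exploits exactly this freedom on non-keys: since it never has to tell keys apart from non-keys, it can avoid storing the keys themselves.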