Given an increasing sequence of integers $x_1,\ldots,x_n$ from a universe $\{0,\ldots,u-1\}$, the monotone minimal perfect hash function (MMPHF) for this sequence is a data structure that answers the following rank queries: $rank(x) = i$ if $x = x_i$, for $i\in \{1,\ldots,n\}$, and $rank(x)$ is arbitrary otherwise. At SODA'23, Assadi, Farach-Colton, and Kuszmaul proved a lower bound of $\Omega(n \min\{\log\log\log u, \log n\})$ bits on the space required by an MMPHF, provided $u \ge n 2^{2^{\sqrt{\log\log n}}}$. This bound is tight, since there is an MMPHF data structure that attains it (and answers queries in $O(\log u)$ time). In this paper, we close the remaining gap by proving that, for $u \ge (1+\epsilon)n$, where $\epsilon > 0$ is any constant, the tight lower bound is $\Omega(n \min\{\log\log\log \frac{u}{n}, \log n\})$, which is also attainable. We observe that, for all reasonable cases when $n < u < (1+\epsilon)n$, known facts imply tight bounds, which virtually settles the problem. Along the way, we substantially simplify the proof of Assadi et al., replacing a part of their heavy combinatorial machinery with trivial observations. However, an important part of the proof remains complicated. This part of our paper repeats the arguments of Assadi et al. and is not novel; we include it for completeness and offer a somewhat different perspective on these arguments.
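To make the query semantics concrete, the following minimal Python sketch implements the rank interface defined above. It is an illustration only and not a construction from this paper: it stores all keys explicitly, so it uses far more space than the bounds discussed here. The function name `build_naive_mmphf` is hypothetical.

```python
from bisect import bisect_left


def build_naive_mmphf(xs):
    """Return a rank function for a strictly increasing list of integers xs.

    Naive reference for the query semantics only: the keys are stored
    explicitly, so this is NOT space-efficient.
    """
    keys = list(xs)
    if any(a >= b for a, b in zip(keys, keys[1:])):
        raise ValueError("keys must be strictly increasing")

    def rank(x):
        # For a stored key x = x_i, bisect_left returns i - 1 (0-based),
        # so the result is the 1-based rank i.
        # For a non-key the definition allows any answer; this sketch
        # simply returns whatever position the binary search yields.
        return bisect_left(keys, x) + 1

    return rank


rank = build_naive_mmphf([3, 7, 19, 42])
print(rank(3), rank(19), rank(42))
```

A space-efficient MMPHF exploits exactly the freedom this sketch ignores: since the answer on non-keys is arbitrary, the structure does not need to be able to reconstruct the key set.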