The block tree [Belazzougui et al., J. Comput. Syst. Sci. '21] is a compressed representation of a length-$n$ text that supports access, rank, and select queries while requiring only $O(z\log\frac{n}{z})$ words of space, where $z$ is the number of Lempel-Ziv factors of the text. In other words, its space requirements are asymptotically similar to those of the compressed text. In practice, block trees offer query performance comparable to that of state-of-the-art compressed rank and select indices. However, their construction is significantly slower. Additionally, the fastest construction algorithms require a significant amount of working memory. To address these issues, we propose fast and lightweight parallel algorithms for the efficient construction of block trees. Our algorithms achieve speed similar to that of the currently fastest construction algorithm on one core and are up to four times faster using 64 cores, while requiring an order of magnitude less memory. As a result of independent interest, we present a data-parallel algorithm for Karp-Rabin fingerprint computation.
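To make the fingerprinting idea concrete, the following is a minimal sketch of Karp-Rabin fingerprints and of the concatenation property that lets chunks of a text be fingerprinted independently and merged afterwards. It is an illustration only, not the algorithm presented in the paper: the modulus `PRIME`, the radix `BASE`, and the helper names `fingerprint`, `combine`, and `chunked_fingerprint` are assumptions chosen for this example, and the chunk loop runs sequentially here although each chunk could be handled by a separate worker.

```python
# Illustrative parameters (assumptions, not taken from the paper).
PRIME = (1 << 61) - 1  # a Mersenne prime used as the modulus
BASE = 256             # radix: one symbol per byte


def fingerprint(data: bytes) -> int:
    """Karp-Rabin fingerprint of data, evaluated with Horner's rule."""
    fp = 0
    for symbol in data:
        fp = (fp * BASE + symbol) % PRIME
    return fp


def combine(fp_left: int, fp_right: int, len_right: int) -> int:
    """Fingerprint of the concatenation left + right.

    Shifting the left fingerprint by BASE**len_right makes room for the
    right part, exactly as Horner's rule would have done symbol by symbol.
    """
    return (fp_left * pow(BASE, len_right, PRIME) + fp_right) % PRIME


def chunked_fingerprint(data: bytes, chunk_size: int) -> int:
    """Fingerprint data by chunks (chunk_size must be positive)."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Each chunk fingerprint depends only on its own chunk, so this step
    # could be distributed across workers; it is sequential in this sketch.
    partial = [(fingerprint(chunk), len(chunk)) for chunk in chunks]
    fp = 0
    for chunk_fp, chunk_len in partial:
        fp = combine(fp, chunk_fp, chunk_len)
    return fp
```

Because `combine` is associative, the partial results could also be merged pairwise in a tree-shaped reduction instead of the left-to-right loop used above.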