Palindromes are strings that read the same forward and backward. Computing palindromic structures within strings is a fundamental problem in string algorithms, motivated by potential applications in formal language theory and bioinformatics. Although the number of palindromic factors in a string of length $n$ can be quadratic, they can be represented implicitly in $O(n \log n)$ bits of space by storing the lengths of all maximal palindromes in an integer array, which can be computed in $O(n)$ time [Manacher, 1975]. In this paper, for any positive constant $ε < 1$, we propose a novel $(3(1+ε)n + o(n))$-bit representation of all maximal palindromes in a string that supports $O(1)$-time retrieval of the length of the maximal palindrome centered at any given position. The data structure can be constructed in $O(n)$ time from the input string of length $n$. Since Manacher's algorithm and the notion of maximal palindromes are widely used to solve problems involving palindromic structures, our compact representation is expected to accelerate the development of more space-efficient solutions to such problems. Indeed, as a first application of our compact representation of maximal palindromes, we present a data structure of size $O(n)$ bits that computes the longest palindrome appearing in any given factor of a string of length $n$ in $O(\log n)$ time.
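
For orientation, the integer-array baseline mentioned above can be sketched as follows. This is a minimal illustration of Manacher's linear-time algorithm, not the compact representation proposed in the paper. The function name `maximal_palindromes`, the separator-interleaving trick, and the center-indexing convention (even indices for characters, odd indices for gaps between characters) are assumptions made for this sketch.

```python
def maximal_palindromes(s: str) -> list[int]:
    """Length of the maximal palindrome at each of the 2n-1 centers of s.

    Center 2*i is the character s[i]; center 2*i + 1 is the gap
    between s[i] and s[i+1]. (Indexing convention assumed for this sketch.)
    """
    n = len(s)
    if n == 0:
        return []
    # Interleave a separator so that every palindrome in t has odd length.
    # Separators are only ever compared with separators (same index parity),
    # so the choice of '#' is harmless even if s itself contains '#'.
    t = "#" + "#".join(s) + "#"  # length 2n + 1
    m = len(t)
    rad = [0] * m       # rad[i]: radius of the maximal palindrome centered at t[i]
    center = right = 0  # center and right end of the rightmost palindrome seen so far
    for i in range(m):
        if i < right:
            # Reuse the value at the mirror position, clipped to the known boundary.
            rad[i] = min(right - i, rad[2 * center - i])
        # Extend by explicit character comparisons.
        while (i - rad[i] - 1 >= 0 and i + rad[i] + 1 < m
               and t[i - rad[i] - 1] == t[i + rad[i] + 1]):
            rad[i] += 1
        if i + rad[i] > right:
            center, right = i, i + rad[i]
    # A radius in t equals the palindrome length in s; drop the two end sentinels.
    return rad[1:m - 1]


if __name__ == "__main__":
    print(maximal_palindromes("abaab"))  # [1, 0, 3, 0, 1, 4, 1, 0, 1]
```

Each of the $2n-1$ entries is an integer of up to $\lceil \log_2 (n+1) \rceil$ bits, which is where the $O(n \log n)$-bit cost of the plain array comes from.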