We revisit the Heaviest Induced Ancestors (HIA) problem, which was introduced by Gagie, Gawrychowski, and Nekrich [CCCG 2013] and has a number of applications in string algorithms. Let $T_1$ and $T_2$ be two rooted trees whose node weights increase along every root-to-leaf path and whose leaves carry labels, such that no two leaves of the same tree have the same label. A pair of nodes $(u, v)\in T_1 \times T_2$ is \emph{induced} if and only if there is a label shared by leaf-descendants of $u$ and $v$. In an HIA query, given nodes $x \in T_1$ and $y \in T_2$, the goal is to find an induced pair of nodes $(u, v)$ of maximum total weight such that $u$ is an ancestor of~$x$ and $v$ is an ancestor of~$y$. Let $n$ be an upper bound on the sizes of the two trees. It is known that no data structure of size $\tilde{\mathcal{O}}(n)$ can answer HIA queries in $o(\log n / \log \log n)$ time [Charalampopoulos, Gawrychowski, Pokorski; ICALP 2020]. This (unconditional) lower bound is a $\operatorname{polyloglog} n$ factor away from the query time of the fastest $\tilde{\mathcal{O}}(n)$-size data structure known to date for the HIA problem [Abedin, Hooshmand, Ganguly, Thankachan; Algorithmica 2022]. In this work, we resolve the query-time complexity of the HIA problem in the near-linear space regime by presenting a data structure that can be built in $\tilde{\mathcal{O}}(n)$ time and answers HIA queries in $\mathcal{O}(\log n/\log\log n)$ time. As a direct corollary, we obtain an $\tilde{\mathcal{O}}(n)$-size data structure that maintains the longest common substring (LCS) of a static string and a dynamic string, both of length at most $n$, in time that is optimal for this space regime. The main ingredients of our approach are fractional cascading and the use of an $\mathcal{O}(\log n/ \log\log n)$-depth tree decomposition.
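To make the query definition above concrete, here is a minimal brute-force sketch in Python. It is not the data structure of this work: it simply tries every pair of ancestors and so takes far longer than $\mathcal{O}(\log n/\log\log n)$ per query. The names `Tree` and `hia_naive`, the parent-array representation, and the example trees are all hypothetical, and the sketch assumes that a node counts as an ancestor of itself.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    parent: list  # parent[v] is the parent of node v, None for the root
    weight: list  # weight[v]; assumed to increase along root-to-leaf paths
    label: dict   # maps each leaf to its label (distinct within one tree)

    def ancestors(self, v):
        """Return v followed by its ancestors up to the root (v included by assumption)."""
        path = []
        while v is not None:
            path.append(v)
            v = self.parent[v]
        return path

    def leaf_labels(self):
        """Return a list whose entry v is the set of labels on leaf-descendants of v."""
        labels = [set() for _ in self.parent]
        for leaf, lab in self.label.items():
            for a in self.ancestors(leaf):
                labels[a].add(lab)
        return labels


def hia_naive(t1, t2, x, y):
    """Brute-force HIA query: return (total weight, u, v), or None if no pair is induced."""
    l1, l2 = t1.leaf_labels(), t2.leaf_labels()
    best = None
    for u in t1.ancestors(x):
        for v in t2.ancestors(y):
            if l1[u] & l2[v]:  # (u, v) is induced: some label is shared
                total = t1.weight[u] + t2.weight[v]
                if best is None or total > best[0]:
                    best = (total, u, v)
    return best


# Hypothetical example: node 0 is the root of each tree.
t1 = Tree(parent=[None, 0, 0, 1, 1], weight=[0, 2, 3, 5, 6],
          label={3: "a", 4: "b", 2: "c"})
t2 = Tree(parent=[None, 0, 0, 1, 1], weight=[0, 1, 4, 7, 8],
          label={3: "a", 4: "c", 2: "b"})

print(hia_naive(t1, t2, 3, 4))  # (8, 0, 4)
print(hia_naive(t1, t2, 4, 2))  # (10, 4, 2)
```

In the first query the heaviest induced pair is the root of $T_1$ together with the queried leaf of $T_2$, which shows that the answer need not use the deepest ancestor in both trees at once.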