The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions, and relabelings of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. Years of steady improvements have led to an $O(n^3)$ algorithm [DMRW 2010]. Fine-grained complexity sheds light on the hardness of TED, showing that a truly subcubic time algorithm for TED implies a truly subcubic time algorithm for All-Pairs Shortest Paths (APSP) [BGMW 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this question. Not only do we show that TED is fine-grained equivalent to APSP, but our reduction is also tight enough that, combined with the fastest APSP algorithm to date [Williams 2018], it gives the first subcubic time algorithm for TED, running in $n^3/2^{\Omega(\sqrt{\log{n}})}$ time. We also consider the unweighted tree edit distance problem, in which the cost of each edit is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao 2022], later improved slightly by D\"{u}rr [D\"{u}rr 2023] to run in $O(n^{2.9148})$ time. Their algorithm uses the bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{O}(n^{\frac{3+\omega}{2}})\leq O(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication). In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{O}(n^{\frac{3+\omega}{2}})$ time.
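To make the definition of unweighted TED concrete, the following is a minimal illustrative sketch of the textbook forest recurrence on the rightmost roots, with memoization. It is not the algorithm of this paper, nor any of the cited cubic or subcubic algorithms, and it makes no attempt to be efficient. The tree encoding (a pair `(label, children)` with `children` a tuple of trees) and the names `size`, `forest_distance`, and `tree_edit_distance` are assumptions chosen for illustration only.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees; a forest is a tuple of trees. Every edit costs 1 (unweighted TED).

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (label_v, kids_v), (label_w, kids_w) = f[-1], g[-1]
    # Delete the rightmost root v of f; its children take its place.
    delete = forest_distance(f[:-1] + kids_v, g) + 1
    # Insert the rightmost root w of g; its children take its place.
    insert = forest_distance(f, g[:-1] + kids_w) + 1
    # Map v to w (relabel if the labels differ), then solve the two
    # independent subproblems: below the roots and to their left.
    match = (forest_distance(kids_v, kids_w)
             + forest_distance(f[:-1], g[:-1])
             + (label_v != label_w))
    return min(delete, insert, match)

def tree_edit_distance(t1, t2):
    """Unit-cost tree edit distance between two rooted ordered trees."""
    return forest_distance((t1,), (t2,))

# Small usage example: deleting the leaf "c" turns t1 into t2.
t1 = ("a", (("b", ()), ("c", ())))
t2 = ("a", (("b", ()),))
print(tree_edit_distance(t1, t2))
```

The recurrence mirrors the three operations in the definition: deleting a node, inserting a node, or matching two nodes with an optional relabeling. The number of distinct subforests this naive version visits can be far larger than what the decomposition strategies behind the $O(n^3)$ bound allow, so it should only be read as an executable statement of the problem.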