The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions, and relabelings of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. Years of steady improvements have led to an $O(n^3)$ algorithm [DMRW 2010]. Fine-grained complexity sheds light on the hardness of TED, showing that a truly subcubic time algorithm for TED implies a truly subcubic time algorithm for All-Pairs Shortest Paths (APSP) [BGMW 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this question. Not only do we show that TED is fine-grained equivalent to APSP, but our reduction is also tight enough that, combined with the fastest APSP algorithm to date [Williams 2018], it gives the first subcubic time algorithm for TED, running in $n^3/2^{\Omega(\sqrt{\log{n}})}$ time. We also consider the unweighted tree edit distance problem, in which the cost of each edit is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao 2022], later improved slightly by D\"{u}rr [D\"{u}rr 2023] to run in $O(n^{2.9148})$ time. These algorithms use the bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{O}(n^{\frac{3+\omega}{2}})\leq O(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication). In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{O}(n^{\frac{3+\omega}{2}})$ time.
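To make the definition concrete, the following is a minimal sketch of unweighted TED using the textbook recursion on forests, which removes or matches the rightmost roots and memoizes the results. It is not the $O(n^3)$ algorithm of [DMRW 2010], nor any algorithm from this paper. The tree encoding (a `(label, children)` tuple) and the names `forest_size`, `forest_distance`, and `tree_edit_distance` are assumptions made for illustration only.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees in left-to-right order; a forest is a tuple of trees.


def forest_size(forest):
    """Number of nodes in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return forest_size(g)  # insert every node of g
    if not g:
        return forest_size(f)  # delete every node of f

    v_label, v_children = f[-1]  # rightmost root of f
    w_label, w_children = g[-1]  # rightmost root of g

    # Delete v: its children take its place among the roots of f.
    delete_v = forest_distance(f[:-1] + v_children, g) + 1
    # Insert w: symmetric to deleting w from g.
    insert_w = forest_distance(f, g[:-1] + w_children) + 1
    # Match v with w (relabel if the labels differ), then solve the two
    # independent subproblems: the remaining roots and the children.
    match_vw = (
        forest_distance(f[:-1], g[:-1])
        + forest_distance(v_children, w_children)
        + (0 if v_label == w_label else 1)
    )
    return min(delete_v, insert_w, match_vw)


def tree_edit_distance(t1, t2):
    """Unit-cost TED between two rooted ordered labeled trees."""
    return forest_distance((t1,), (t2,))
```

For example, with `t1 = ("a", (("b", (("c", ()),)),))` and `t2 = ("a", (("c", ()),))`, calling `tree_edit_distance(t1, t2)` deletes the node labeled `b` and returns 1. Because the trees are ordered, swapping two sibling leaves counts as two relabelings rather than zero edits.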