Classically, the edit distance of two length-$n$ strings can be computed in $O(n^2)$ time, whereas an $O(n^{2-\epsilon})$-time procedure would falsify the Orthogonal Vectors Hypothesis (OVH). If the edit distance does not exceed $k$, the running time can be improved to $O(n+k^2)$, which is near-optimal (conditioned on OVH) as a function of $n$ and $k$. Our first main contribution is a quantum $\tilde{O}(\sqrt{nk}+k^2)$-time algorithm that uses $\tilde{O}(\sqrt{nk})$ queries, where $\tilde{O}(\cdot)$ hides polylogarithmic factors. This query complexity is unconditionally optimal, and any significant improvement in the time complexity would resolve the long-standing open question of whether edit distance admits an $O(n^{2-\epsilon})$-time quantum algorithm. Our divide-and-conquer quantum algorithm reduces the edit distance problem to a case where the strings have small Lempel-Ziv (LZ) factorizations. It then combines a quantum LZ compression algorithm with a classical edit-distance subroutine for compressed strings.

The LZ factorization problem can be solved classically in $O(n)$ time, which is unconditionally optimal in the quantum setting. We can, however, hope for a quantum speedup if we parameterize the complexity in terms of the factorization size $z$. A generic oracle identification algorithm already yields the optimal query complexity of $\tilde{O}(\sqrt{nz})$, but at the price of exponential running time. Our second main contribution is a quantum algorithm that achieves the optimal time complexity of $\tilde{O}(\sqrt{nz})$. The key tool is a novel LZ-like factorization of size $O(z\log^2 n)$ whose subsequent factors can be efficiently computed through a combination of classical and quantum techniques. We can then obtain the string's run-length encoded Burrows-Wheeler Transform (BWT), construct the $r$-index, and solve many fundamental string processing problems in time $\tilde{O}(\sqrt{nz})$.
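To make the bounded edit distance problem concrete, the following is a minimal classical sketch in Python. It is not the quantum algorithm described above, nor the $O(n+k^2)$-time classical algorithm: it is the simpler banded dynamic program, which runs in $O(nk)$ time by evaluating only the cells within distance $k$ of the main diagonal. The function name `bounded_edit_distance` and the convention of returning `None` when the distance exceeds $k$ are assumptions made for illustration.

```python
def bounded_edit_distance(a, b, k):
    """Return the edit distance of a and b if it is at most k, else None.

    Illustrative banded DP (O(n*k) time): only cells (i, j) with
    |i - j| <= k are evaluated, since any cell outside that band
    has edit distance greater than k.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    INF = k + 1  # stands for "more than k"; values are capped here
    # Row 0: turning the empty prefix of a into b[:j] costs j insertions.
    prev = {j: j for j in range(0, min(m, k) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                best = i  # delete all i characters of a[:i]
            else:
                best = min(
                    prev.get(j - 1, INF) + (a[i - 1] != b[j - 1]),  # match or substitute
                    prev.get(j, INF) + 1,                           # delete a[i-1]
                    cur.get(j - 1, INF) + 1,                        # insert b[j-1]
                )
            cur[j] = min(best, INF)
        prev = cur
    d = prev.get(m, INF)
    return d if d <= k else None
```

For instance, `bounded_edit_distance("kitten", "sitting", 3)` returns `3`, while the same call with `k = 2` returns `None` because the true distance exceeds the threshold.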
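As a reference point for the factorization size $z$, here is a naive classical sketch of the LZ factorization in Python. It assumes the common LZ77 variant in which each factor is either a fresh character or the longest prefix of the remaining text that also occurs starting at an earlier position (overlaps with the factor itself are allowed); the abstract does not pin down the exact variant, so this is an assumption. The quadratic-time search bears no relation to the $O(n)$ classical or $\tilde{O}(\sqrt{nz})$ quantum algorithms; it only illustrates what is being computed. The names `lz77_factorize` and `lz77_decode` are hypothetical.

```python
def lz77_factorize(s):
    """Greedy LZ77-style factorization (naive, for illustration only).

    Each factor is either a single fresh character (a str) or a pair
    (src, length) meaning "copy `length` characters starting at the
    earlier position `src`". The number of factors plays the role of z.
    """
    factors = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_src = 0, -1
        for src in range(i):  # try every earlier starting position
            l = 0
            # The copy may run past position i (self-overlap is allowed).
            while i + l < n and s[src + l] == s[i + l]:
                l += 1
            if l > best_len:
                best_len, best_src = l, src
        if best_len == 0:
            factors.append(s[i])  # character not seen before
            i += 1
        else:
            factors.append((best_src, best_len))
            i += best_len
    return factors


def lz77_decode(factors):
    """Rebuild the string from the factors produced by lz77_factorize."""
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            src, length = f
            for t in range(length):
                # Character-by-character copy handles overlapping sources.
                out.append(out[src + t])
    return "".join(out)
```

On highly repetitive inputs the factor count is far smaller than the string length: `lz77_factorize("abcabcabc")` yields `['a', 'b', 'c', (0, 6)]`, so $z = 4$ for $n = 9$, which is the regime where parameterizing by $z$ pays off.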