In this paper, we consider two versions of the Text Assembling problem. We are given a sequence of strings $s^1,\dots,s^n$ of total length $L$, which forms a dictionary, and a string $t$ of length $m$, which is a text. The first version of the problem is to assemble $t$ from the dictionary. The second version is the ``Shortest Superstring Problem'' (SSP), also known as the ``Shortest Common Superstring Problem'' (SCS). In this case, $t$ is not given, and we must construct the shortest string (called a superstring) that contains each string of the given sequence as a substring. These problems are connected with the sequence assembly method for reconstructing a long DNA sequence from small fragments. For both problems, we suggest new quantum algorithms that outperform their classical counterparts. For the first problem, we present a quantum algorithm with running time $O(m+\log m\sqrt{nL})$. For SSP, we present a quantum algorithm with running time $O(n^3 1.728^n +L +\sqrt{L}n^{1.5}+\sqrt{L}n\log^2L\log^2n)$.
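To make the SSP definition concrete, the following is a minimal classical brute-force sketch in Python. It is not the quantum algorithm proposed in the paper: it simply tries every ordering of the dictionary strings and merges neighbours on their longest suffix-prefix overlap, so it is only practical for very small $n$. The function names `overlap` and `shortest_superstring` are hypothetical and chosen for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring(strings: list[str]) -> str:
    """Exhaustive search for a shortest common superstring (illustrative only)."""
    # Drop duplicates, then drop strings already contained in another string:
    # any superstring of the remaining strings automatically covers them.
    uniq = list(dict.fromkeys(strings))
    core = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    if not core:
        return ""

    best = None
    for order in permutations(core):
        merged = order[0]
        for prev, nxt in zip(order, order[1:]):
            # Append only the part of `nxt` not already covered by `prev`.
            merged += nxt[overlap(prev, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


if __name__ == "__main__":
    print(shortest_superstring(["abc", "bcd", "cde"]))  # prints "abcde"
```

The search visits all $n!$ orderings, which illustrates why exponential-time exact algorithms in $n$, such as the one whose running time is stated above, are the relevant point of comparison for SSP.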