In the deletion channel, an important problem is to determine the number of distinct subsequences obtained from a string $U$ of length $n$ when it is subjected to $t$ deletions. It is well known that the number of subsequences in this setting depends strongly on the number of runs in $U$, where a run is a maximal substring of identical characters. In this paper we study the number of subsequences of a non-binary string in this scenario and derive improved bounds on the number of subsequences of $r$-run non-binary strings. Specifically, we characterize a family of $r$-run non-binary strings that attain the maximum number of subsequences under any $t$ deletions, and we show that this number can be computed in polynomial time.
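To make the quantity concrete, the following is a minimal brute-force sketch, not the polynomial-time method of the paper. It enumerates every way of deleting $t$ symbols from a string and counts the distinct results. The function names `count_runs`, `deletion_ball`, and `deletion_ball_size` are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations, groupby


def count_runs(u: str) -> int:
    """Number of runs (maximal blocks of identical characters) in u."""
    return sum(1 for _ in groupby(u))


def deletion_ball(u: str, t: int) -> set[str]:
    """All distinct subsequences of u obtained by deleting exactly t symbols.

    Brute force: choose which n - t positions to keep. The cost grows as
    C(n, t), so this is only suitable for short strings.
    """
    n = len(u)
    if t < 0 or t > n:
        return set()
    return {
        "".join(u[i] for i in kept)
        for kept in combinations(range(n), n - t)
    }


def deletion_ball_size(u: str, t: int) -> int:
    """Number of distinct subsequences of u under exactly t deletions."""
    return len(deletion_ball(u, t))


if __name__ == "__main__":
    # Same length and alphabet, different run counts.
    for u in ["aabb", "abab", "abca"]:
        print(u, count_runs(u), deletion_ball_size(u, 1), deletion_ball_size(u, 2))
```

For a single deletion the count equals the number of runs, which is why `aabb` (2 runs) yields 2 subsequences while `abab` (4 runs) yields 4. For $t = 2$ the same two strings yield 3 and 4 subsequences respectively, and the ternary string `abca` yields 6, which illustrates the dependence on both the run structure and the alphabet.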