In the deletion channel, an important problem is to determine the number of distinct subsequences that a string $U$ of length $n$ can produce under $t$ deletions. It is well known that this number depends strongly on the number of runs in $U$, where a run is a maximal substring of identical characters. In this paper, we study the number of subsequences of non-binary strings in this setting and propose improved bounds on the number of subsequences of $r$-run non-binary strings. Specifically, we characterize a family of $r$-run non-binary strings that attain the maximum number of subsequences under any $t$ deletions, and show that this number can be computed in polynomial time.
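To make the counted quantity concrete, the following minimal sketch computes the number of runs of a string and the number of distinct subsequences left after exactly $t$ deletions. It uses a textbook dynamic program over prefixes, not the characterization or the bounds developed in this paper, and the function names `count_runs` and `count_deletion_subsequences` are hypothetical, chosen only for illustration.

```python
def count_runs(u):
    """Number of runs (maximal blocks of identical characters) in u."""
    return sum(1 for i, c in enumerate(u) if i == 0 or c != u[i - 1])


def count_deletion_subsequences(u, t):
    """Number of distinct subsequences of u obtained by exactly t deletions.

    Illustrative textbook dynamic program (an assumption of this sketch,
    not the paper's method). dp[j] counts the distinct subsequences of
    length j in the prefix processed so far.
    """
    n = len(u)
    k = n - t  # length of every subsequence after t deletions
    if t < 0 or k < 0:
        raise ValueError("t must satisfy 0 <= t <= len(u)")

    dp = [1] + [0] * k  # only the empty subsequence before reading anything
    last = {}           # symbol -> copy of dp taken just before its previous occurrence

    for c in u:
        prev = dp[:]        # state before appending c
        old = last.get(c)   # state before the previous occurrence of c, if any
        for j in range(1, k + 1):
            # Keep the old subsequences of length j, extend those of length
            # j - 1 by c, and subtract the ones already produced by the
            # previous occurrence of c.
            dp[j] = prev[j] + prev[j - 1] - (old[j - 1] if old is not None else 0)
        last[c] = prev

    return dp[k]


if __name__ == "__main__":
    # Same length and alphabet, different run counts.
    print(count_runs("aabb"), count_deletion_subsequences("aabb", 1))  # 2 2
    print(count_runs("abab"), count_deletion_subsequences("abab", 1))  # 4 4
```

For a single deletion the count equals the number of runs, which is the simplest instance of the dependence on runs noted above. The sketch runs in $O(n(n-t))$ arithmetic operations for one given string; it does not search over all $r$-run strings.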