A subsequence of a word $w$ is a word $u$ such that $u = w[i_1] w[i_2] \dots w[i_{k}]$ for some indices $1 \leq i_1 < i_2 < \dots < i_k \leq \lvert w\rvert$. A word $w$ is $k$-subsequence universal over an alphabet $\Sigma$ if every word in $\Sigma^k$ appears in $w$ as a subsequence. In this paper, we study the intersection between the set of $k$-subsequence universal words over an alphabet $\Sigma$ and regular languages over $\Sigma$. We call a regular language $L$ \emph{$k$-$\exists$-subsequence universal} if there exists a $k$-subsequence universal word in $L$, and \emph{$k$-$\forall$-subsequence universal} if every word of $L$ is $k$-subsequence universal. We give algorithms that decide, for a given $k$, whether a regular language, represented by a finite automaton recognising it, is $k$-$\exists$-subsequence universal and, respectively, whether it is $k$-$\forall$-subsequence universal. The algorithms are fixed-parameter tractable (FPT) with respect to the size of the input alphabet, and their running time does not depend on $k$; they run in time polynomial in the number $n$ of states of the input automaton when the size of the input alphabet is $O(\log n)$. Moreover, we show that deciding whether a given regular language is $k$-$\exists$-subsequence universal is NP-complete when the language is over a large alphabet. Further, we provide algorithms for counting the number of $k$-subsequence universal words (paths) accepted by a given deterministic (respectively, nondeterministic) finite automaton, and for ranking an input word (path) within the set of $k$-subsequence universal words accepted by a given finite automaton.
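To make the definition of $k$-subsequence universality concrete, the following minimal sketch tests a single word, not a regular language or an automaton, so it does not reproduce the algorithms announced above. It relies on the standard arch-factorisation characterisation, which the abstract does not state: scanning $w$ from left to right and closing a block each time every letter of $\Sigma$ has been seen yields some number of complete blocks, and $w$ is $k$-subsequence universal exactly when there are at least $k$ of them. The function names are hypothetical, $\Sigma$ is assumed non-empty, and the brute-force checker is included only to compare against the definition directly.

```python
from itertools import product


def universality_index(word, alphabet):
    """Count the complete arches in the greedy left-to-right factorisation.

    An arch is closed as soon as every letter of the alphabet has been seen
    since the previous arch ended. Assumes a non-empty alphabet.
    """
    sigma = set(alphabet)
    seen = set()
    arches = 0
    for letter in word:
        if letter in sigma:
            seen.add(letter)
            if seen == sigma:
                arches += 1
                seen = set()
    return arches


def is_k_universal(word, alphabet, k):
    """Arch-based test: at least k complete arches."""
    return universality_index(word, alphabet) >= k


def is_subsequence(u, w):
    """True if u can be obtained from w by deleting letters."""
    remaining = iter(w)
    # Each membership test consumes the iterator up to the matched letter,
    # so the letters of u are matched in increasing positions of w.
    return all(letter in remaining for letter in u)


def is_k_universal_bruteforce(word, alphabet, k):
    """Definition-based test: every word of length k is a subsequence."""
    return all(is_subsequence(u, word) for u in product(alphabet, repeat=k))


if __name__ == "__main__":
    print(universality_index("abcabcab", "abc"))
    print(is_k_universal("abba", "ab", 2))
    print(is_k_universal("aabb", "ab", 2))
```

The arch-based test runs in time linear in $\lvert w\rvert$ and, like the running times claimed above, does not depend on $k$, whereas the brute-force test enumerates all $\lvert\Sigma\rvert^k$ candidate words.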