We study the problem of sequentially testing whether a given stochastic process is generated by a known Markov chain. Formally, given access to a stream of random variables, we want to determine as quickly as possible whether this sequence is a trajectory of a Markov chain with a known transition matrix $P$ (the null hypothesis) or not (a composite alternative hypothesis). This problem arises naturally in many engineering applications. The main technical challenge is to develop a sequential testing scheme that adapts its sample size to the unknown alternative. Indeed, if we knew the alternative distribution (that is, the transition matrix) $Q$, a natural approach would be to use a generalization of Wald's sequential probability ratio test (SPRT). Building on this intuition, we propose and analyze a family of one-sided SPRT-type tests for our problem that replace the unknown $Q$ with a data-driven estimator $\hat{Q}$. In particular, we show that if the deployed estimator admits a worst-case regret guarantee scaling as $\mathcal{O}\left( \log{t} \right)$, then the performance of our test asymptotically matches that of the SPRT in the simple hypothesis testing case. In other words, our test automatically adapts to the unknown hardness of the problem, without any prior information. We end with a discussion of known Markov chain estimators with $\mathcal{O}\left( \log{t} \right)$ regret.
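To make the idea of a one-sided SPRT-type test with a plug-in estimator concrete, the following is a minimal sketch and not necessarily the exact procedure analyzed in the paper. It accumulates the log-likelihood ratio $\sum_{s \le t} \log\big( \hat{Q}_{s-1}(x_{s-1}, x_s) / P(x_{s-1}, x_s) \big)$, where $\hat{Q}_{s-1}$ is computed only from transitions seen before step $s$, and rejects the null as soon as the sum reaches $\log(1/\alpha)$. Several pieces are assumptions made for illustration: the function names `sequential_markov_test` and `simulate_chain`, the add-constant (Krichevsky-Trofimov style) estimator with pseudo-count `prior = 0.5`, and the threshold $\log(1/\alpha)$, which controls the false-alarm probability at level $\alpha$ by Ville's inequality because the likelihood ratio is a nonnegative martingale under the null.

```python
import math
import random


def sequential_markov_test(stream, P, alpha=0.05, prior=0.5):
    """One-sided SPRT-type test of H0: `stream` follows transition matrix P.

    States are integers 0..k-1. Returns the number of transitions observed
    when H0 is rejected, or None if the stream ends without a rejection.
    The estimator and the threshold are illustrative assumptions.
    """
    k = len(P)
    counts = [[0] * k for _ in range(k)]  # transition counts seen so far
    threshold = math.log(1.0 / alpha)
    log_lr = 0.0

    it = iter(stream)
    try:
        prev = next(it)
    except StopIteration:
        return None

    for t, cur in enumerate(it, start=1):
        if P[prev][cur] == 0.0:
            # A transition that is impossible under H0 is conclusive evidence.
            return t
        # Predictive add-constant estimate, using only past transitions.
        row_total = sum(counts[prev])
        q_hat = (counts[prev][cur] + prior) / (row_total + prior * k)
        log_lr += math.log(q_hat / P[prev][cur])
        counts[prev][cur] += 1  # update the estimator after predicting
        if log_lr >= threshold:
            return t
        prev = cur
    return None


def simulate_chain(Q, n, rng, start=0):
    """Yield a trajectory with n transitions from transition matrix Q."""
    state = start
    yield state
    for _ in range(n):
        state = rng.choices(range(len(Q)), weights=Q[state])[0]
        yield state


if __name__ == "__main__":
    rng = random.Random(0)
    P = [[0.5, 0.5], [0.5, 0.5]]   # known null chain
    Q = [[0.9, 0.1], [0.2, 0.8]]   # alternative, unknown to the test
    print("stop under alternative:",
          sequential_markov_test(simulate_chain(Q, 5000, rng), P))
    print("stop under null:",
          sequential_markov_test(simulate_chain(P, 5000, rng), P))
```

Updating the counts only after the prediction is made keeps $\hat{Q}$ non-anticipating, which is what preserves the martingale property under the null. The stopping time under an alternative depends on how far $Q$ is from $P$, which is the adaptivity the abstract refers to.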