We study one-sided, $α$-correct sequential hypothesis testing for data generated by an ergodic Markov chain. The null hypothesis is that the unknown transition matrix belongs to a prescribed set $P$ of stochastic matrices, and the alternative corresponds to a disjoint set $Q$. We establish a tight, non-asymptotic, instance-dependent lower bound on the expected stopping time of any valid sequential test under the alternative. Our analysis improves on existing lower bounds, which are either asymptotic or provably sub-optimal in this setting. The lower bound incorporates both the stationary distribution and the transition structure of the unknown Markov chain. We further propose a test whose expected stopping time matches this lower bound asymptotically as $α\to 0$, making it asymptotically optimal. We illustrate the framework through two applications: sequential detection of model misspecification in Markov chain Monte Carlo, and testing structural properties, such as the linearity of transition dynamics, in Markov decision processes. Our findings yield a sharp and general characterization of optimal sequential testing procedures under Markovian dependence.