We investigate center estimation in the high-dimensional binary sub-Gaussian mixture model with hidden Markov structure on the labels. We first study the limitations of existing results in the high-dimensional setting and then propose a minimax-optimal procedure for center estimation. Among other findings, we show that our procedure attains the optimal rate, which is of order $\sqrt{\delta d/n} + d/n$ rather than $\sqrt{d/n} + d/n$, where $\delta \in (0,1)$ is a dependence parameter between labels. Along the way, we also develop an adaptive variant of our procedure that is globally minimax optimal; to do so, we rely on a more refined and localized analysis of the estimation risk. Overall, by leveraging the hidden Markovian dependence between the labels, we show that a strict improvement of the rates can be obtained adaptively at almost no cost.
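To give a feel for the size of the improvement, the short sketch below evaluates the two rate expressions quoted above for a few values of $\delta$. It is only an arithmetic illustration: the function names `rate_independent` and `rate_markov` and the chosen values of `d`, `n`, and `delta` are assumptions made for this example, and all constant factors hidden in "of order" are dropped.

```python
import math


def rate_independent(d: int, n: int) -> float:
    """Rate sqrt(d/n) + d/n, with constant factors omitted."""
    return math.sqrt(d / n) + d / n


def rate_markov(d: int, n: int, delta: float) -> float:
    """Rate sqrt(delta*d/n) + d/n for delta in (0, 1), constants omitted."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(delta * d / n) + d / n


if __name__ == "__main__":
    # Hypothetical dimension and sample size, chosen only for illustration.
    d, n = 100, 10_000
    baseline = rate_independent(d, n)
    print(f"sqrt(d/n) + d/n = {baseline:.4f}")
    for delta in (0.5, 0.1, 0.01):
        improved = rate_markov(d, n, delta)
        print(f"delta = {delta}: sqrt(delta*d/n) + d/n = {improved:.4f}, "
              f"ratio to baseline = {improved / baseline:.3f}")
```

Because the $d/n$ term is common to both expressions, the gain is largest when $d/n$ is small compared with $\sqrt{\delta d/n}$, and it disappears once $d/n$ dominates.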