We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem, which is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts against the need to react "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal: its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic and real-world data validate these theoretical findings.