We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem, which is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may fail catastrophically (incur high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts against the need to react "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal: its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic and real-world data validate these theoretical findings.
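For readers unfamiliar with the CUSUM statistic that ATC takes its name from, the following is a minimal sketch of the classical two-sided CUSUM detector for a shift in mean. It is not the ATC algorithm, whose details the abstract does not give. The function name `cusum_first_alarm` and the parameters `mu0` (pre-change mean, assumed known), `drift` (per-step allowance), and `threshold` (alarm level) are illustrative assumptions.

```python
from typing import Iterable, Optional


def cusum_first_alarm(
    xs: Iterable[float], mu0: float, drift: float, threshold: float
) -> Optional[int]:
    """Return the index of the first two-sided CUSUM alarm, or None.

    Classical textbook CUSUM, shown for illustration only (not ATC).
    mu0 is the assumed pre-change mean, drift the per-step allowance,
    and threshold the level an accumulated statistic must exceed.
    """
    g_up = 0.0    # accumulated evidence for an upward mean shift
    g_down = 0.0  # accumulated evidence for a downward mean shift
    for t, x in enumerate(xs):
        # Deviations smaller than `drift` shrink the statistics toward zero,
        # so only sustained, sufficiently large shifts accumulate evidence.
        g_up = max(0.0, g_up + (x - mu0) - drift)
        g_down = max(0.0, g_down - (x - mu0) - drift)
        if g_up > threshold or g_down > threshold:
            return t
    return None


# Hypothetical noiseless streams with a change at index 50.
large_shift = [0.0] * 50 + [1.0] * 50
small_shift = [0.0] * 50 + [0.2] * 50

alarm_large = cusum_first_alarm(large_shift, mu0=0.0, drift=0.25, threshold=3.0)
alarm_small = cusum_first_alarm(small_shift, mu0=0.0, drift=0.25, threshold=3.0)
```

In this toy setting the shift of size 1.0 raises an alarm a few steps after the change, while the shift of size 0.2 stays below the drift allowance and never does. This is only loosely analogous to the trade-off between ignoring small shifts and reacting quickly to significant ones that the abstract describes.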