This paper investigates non-stationary online learning using the metric of interval regret, which requires an online algorithm to perform well over every time interval. We propose the first online learning algorithm that achieves an interval regret bound scaling with gradient variation, a fundamental measure of the cumulative change in online function gradients, which relates to various problem-dependent quantities and is closely connected to stochastic optimization and other problems. Our method employs a simple and efficient two-layer online ensemble structure that achieves strong theoretical guarantees. Specifically, it enjoys a regret bound that simultaneously adapts to various problem-dependent quantities while also preserving the minimax-optimal rate in the worst case. Moreover, recognizing the challenge of hyperparameter tuning, we introduce a Lipschitz- and smoothness-agnostic variant that automatically adapts to these potentially unknown constants. This is primarily enabled by a novel Lipschitz-adaptive meta algorithm, which may be of independent interest. Beyond interval regret, our method also yields broader implications: it provides versatile bounds for interval dynamic regret, a stronger measure that competes with changing comparators over any interval, and yields the first piecewise characterization for stochastic extended adversarial optimization. Theoretical findings are validated by experiments.
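To make the two central quantities concrete, the following minimal sketch computes them on a toy one-dimensional problem. It is an illustration under stated assumptions, not the paper's algorithm: the losses are assumed to be quadratics f_t(x) = 0.5 (x - c_t)^2 on the domain [-1, 1], gradient variation is assumed to follow the standard definition V_T = sum over t >= 2 of sup_x |f_t'(x) - f_{t-1}'(x)|^2, and all function names are hypothetical.

```python
# Toy illustration only. Assumed losses: f_t(x) = 0.5 * (x - c_t)^2 on [-1, 1].
# All names here are hypothetical and do not come from the paper.

def loss(x, c):
    return 0.5 * (x - c) ** 2

def gradient_variation(centers):
    # f_t'(x) - f_{t-1}'(x) = c_{t-1} - c_t for every x, so the supremum
    # over x is simply (c_t - c_{t-1})^2 for these quadratics.
    return sum((b - a) ** 2 for a, b in zip(centers, centers[1:]))

def interval_regret(plays, centers, r, s):
    # Regret on rounds r..s (0-indexed, inclusive) against the best fixed
    # point in [-1, 1], which is the clipped mean of the centers in the window.
    window = centers[r:s + 1]
    best = max(-1.0, min(1.0, sum(window) / len(window)))
    incurred = sum(loss(x, c) for x, c in zip(plays[r:s + 1], window))
    optimal = sum(loss(best, c) for c in window)
    return incurred - optimal

def worst_interval_regret(plays, centers):
    # Interval regret asks for a small value on every interval, so we take
    # the maximum over all O(T^2) intervals (fine for a toy horizon).
    T = len(centers)
    return max(interval_regret(plays, centers, r, s)
               for r in range(T) for s in range(r, T))

# An environment that switches once, and a learner that always plays 0.
centers = [0.5] * 4 + [-0.5] * 4
plays = [0.0] * 8

full_horizon = interval_regret(plays, centers, 0, len(centers) - 1)
worst = worst_interval_regret(plays, centers)
variation = gradient_variation(centers)
```

In this example the learner has zero regret over the full horizon, because the best fixed point for all eight rounds is 0, yet it has positive regret on the first four rounds alone, where the best fixed point is 0.5. This gap is what interval regret is designed to expose. The gradient variation is small here because the losses change only once.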