We extend and combine several tools from the literature to design fast, adaptive, anytime and scale-free online learning algorithms. Scale-free regret bounds must scale linearly with the maximum loss, both toward large losses and toward very small losses. Adaptive regret bounds demonstrate that an algorithm can take advantage of easy data and potentially have constant regret. We seek to develop fast algorithms that depend on as few parameters as possible; in particular, they should be anytime and thus not depend on the time horizon. Our first and main tool, isotuning, is a generalization of the idea of balancing the trade-off of the regret. We develop a set of tools to design and analyze such learning rates easily and show that they adapt automatically to the rate of the regret (whether constant, $O(\log T)$, $O(\sqrt{T})$, etc.) within a factor of 2 of the optimal learning rate in hindsight for the same observed quantities. The second tool is an online correction, which allows us to obtain centered bounds for many algorithms and thereby prevents the regret bounds from being vacuous when the domain is overly large or only partially constrained. The last tool, null updates, prevents the algorithm from performing overly large updates, which could result in unbounded regret or even invalid updates. We develop a general theory using these tools and apply it to several standard algorithms. In particular, we (almost entirely) restore the adaptivity to small losses of FTRL for unbounded domains, design and prove scale-free adaptive guarantees for a variant of Mirror Descent (at least when the Bregman divergence is convex in its second argument), extend Adapt-ML-Prod to scale-free guarantees, and provide several other minor contributions on Prod, AdaHedge, BOA and Soft-Bayes.