Curvature adaptivity is a classical theme in online optimization: for convex Lipschitz losses, adaptive methods interpolate between the optimal $O(\sqrt{T})$ regret for general convex losses and $O(\log T)$ regret under strong convexity. Recent work has shown that Follow-the-Perturbed-Leader (FTPL) achieves optimal $O(\sqrt{T})$ regret even for online non-convex Lipschitz losses, assuming access to an approximate offline-optimization oracle, but these guarantees do not exploit curvature. We show that FTPL can be made curvature-adaptive in the non-convex setting, without knowing in advance how curvature will accumulate over time. Our algorithm replaces the fixed perturbation scale of standard FTPL with a time-varying scale chosen using only past information. We give a simple follow-the-leader tuning rule for this scale and show that it competes, up to constants, with the best choice in hindsight. The resulting method achieves $O(\sqrt{T})$ regret for arbitrary non-convex Lipschitz losses and improves as cumulative curvature grows; with sufficiently accurate oracle calls, it achieves $O(\log T)$ regret when cumulative curvature grows linearly, which includes the classical strongly convex regime. We complement these upper bounds with matching lower bounds for prescribed cumulative-curvature sequences, already for one-dimensional convex losses, showing that the tradeoff between worst-case non-convex regret and curvature-driven fast rates is intrinsic.
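To make the idea of a time-varying perturbation scale concrete, the following is a minimal sketch of FTPL on a one-dimensional grid, not the algorithm analyzed in this work. Everything named here is an assumption for illustration: the function `ftpl_time_varying`, the brute-force grid search standing in for the approximate offline-optimization oracle, the exponential linear perturbation, and the `scale_fn` argument. In particular, `scale_fn` is a hypothetical placeholder that depends only on the round index; the follow-the-leader tuning rule from the abstract is not reproduced.

```python
import math
import random


def ftpl_time_varying(losses, grid, scale_fn, seed=0):
    """Run FTPL on a 1-D grid with a per-round perturbation scale.

    losses   : list of callables f_t(x) -> float, revealed one per round
    grid     : candidate points; brute-force search over them stands in
               for the offline-optimization oracle (an assumption)
    scale_fn : hypothetical rule mapping the round index t to a scale >= 0
    """
    rng = random.Random(seed)
    cumulative = [0.0] * len(grid)  # sum of past losses at each grid point
    plays = []
    for t, f in enumerate(losses):
        # Fresh exponential perturbation; its scale is fixed before f is seen.
        sigma = scale_fn(t) * rng.expovariate(1.0)
        # Perturbed leader: minimize past cumulative loss minus a linear term.
        best = min(range(len(grid)),
                   key=lambda i: cumulative[i] - sigma * grid[i])
        plays.append(grid[best])
        # The loss is revealed only after the play is committed.
        for i, x in enumerate(grid):
            cumulative[i] += f(x)
    return plays


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    # Non-convex Lipschitz losses on [0, 1], chosen only for illustration.
    losses = [lambda x, k=k: math.sin(6 * x + 0.1 * k) for k in range(50)]
    # Placeholder schedule, not the tuning rule from the abstract.
    plays = ftpl_time_varying(losses, grid, scale_fn=lambda t: math.sqrt(t + 1))
    print(plays[-5:])
```

Setting `scale_fn` to return zero recovers plain follow-the-leader. A large scale makes the play depend mostly on the random perturbation. A curvature-adaptive rule would have to choose between these extremes using only past losses.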