Online eXp-concave Optimization (OXO) is a fundamental problem in online learning, where the goal is to minimize regret when the loss functions are exponentially concave. The standard algorithm, Online Newton Step (ONS), guarantees the optimal $O(d \log T)$ regret, where $d$ is the dimension and $T$ is the time horizon. Despite its simplicity, ONS may face a computational bottleneck due to the Mahalanobis projection required at each round. This step costs $\Omega(d^\omega)$ arithmetic operations on bounded domains, even for simple domains such as the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. This paper proposes a simple variant of ONS, called LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal regret. Deploying LightONS with the online-to-batch conversion yields a method for stochastic exp-concave optimization with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering an open problem posed by Koren [2013]. The design leverages domain-conversion techniques from parameter-free online learning and defers expensive Mahalanobis projections until necessary, thereby preserving the elegant structure of ONS and enabling LightONS to act as an efficient plug-in replacement in broader scenarios, including gradient-norm adaptivity, parametric stochastic bandits, and memory-efficient OXO.
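For intuition on where the $\Omega(d^\omega)$ per-round cost arises, the sketch below implements textbook ONS over the unit ball (not the paper's LightONS). The function names are our own, and the bisection-based Mahalanobis projection is one illustrative way to solve the constrained quadratic; the point is that each round requires a linear solve against the evolving metric $A_t$, which is the expensive step that LightONS defers.

```python
import numpy as np

def mahalanobis_proj_ball(A, z, tol=1e-12):
    """Project z onto the unit Euclidean ball in the A-norm:
    argmin_{||y|| <= 1} (y - z)^T A (y - z).
    By the KKT conditions, y = (A + lam*I)^{-1} A z for some
    multiplier lam >= 0; we find lam by bisection."""
    if np.linalg.norm(z) <= 1.0:
        return z                      # already feasible
    Az = A @ z
    d = A.shape[0]
    lo, hi = 0.0, 1.0
    # grow hi until the candidate point lies inside the ball
    while np.linalg.norm(np.linalg.solve(A + hi * np.eye(d), Az)) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        y = np.linalg.solve(A + lam * np.eye(d), Az)
        if np.linalg.norm(y) > 1.0:
            lo = lam
        else:
            hi = lam
    return np.linalg.solve(A + hi * np.eye(d), Az)

def ons(grads_fn, d, T, gamma=0.5, eps=1.0):
    """Online Newton Step over the unit ball.
    grads_fn(t, x) returns the (sub)gradient of the round-t loss at x."""
    x = np.zeros(d)
    A = eps * np.eye(d)
    for t in range(T):
        g = grads_fn(t, x)
        A += np.outer(g, g)           # rank-one metric update, O(d^2)
        # Newton-style step, then the costly Mahalanobis projection:
        z = x - (1.0 / gamma) * np.linalg.solve(A, g)
        x = mahalanobis_proj_ball(A, z)
    return x
```

Each round performs a linear solve in the metric $A_t$ both for the step and inside the projection; done naively this is where the $d^\omega$ factor enters, whereas the rank-one update of $A_t$ itself is only $O(d^2)$.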