We study online linear optimization with matrix variables constrained by the operator norm, a setting whose geometry makes it challenging to design efficient, data-dependent adaptive algorithms. The best-known adaptive regret bounds are achieved by Shampoo-like methods, but these require solving a costly quadratic projection subproblem. To address this, we extend the gradient-based prediction scheme to adaptive matrix online learning and cast algorithm design as the construction of a family of smoothed potentials for the nuclear norm. We define a notion of admissibility for such smoothings and prove that any admissible smoothing yields a regret bound matching the best-known guarantee of one-sided Shampoo. We instantiate this framework with two efficient methods that avoid quadratic projections. The first is an adaptive Follow-the-Perturbed-Leader (FTPL) method based on Gaussian stochastic smoothing. The second, Follow-the-Augmented-Matrix-Leader (FAML), uses a deterministic hyperbolic smoothing in an augmented matrix space. By analyzing the admissibility of these smoothings, we show that both methods admit closed-form updates and match the regret of one-sided Shampoo up to a constant factor, while significantly reducing computational cost. Finally, via the online-to-nonconvex conversion, we derive two matrix-based optimizers, Pion (from FTPL) and Leon (from FAML), and prove convergence guarantees for them in the nonsmooth nonconvex setting, guarantees that the popular Muon optimizer lacks.
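For intuition, a minimal sketch of what Gaussian stochastic smoothing of the nuclear-norm potential typically looks like is given below; the smoothing scale $\sigma$, the Gaussian matrix $Z$, and the sign conventions are illustrative assumptions, not the paper's exact construction.

```latex
% Illustrative sketch (assumed notation, not the paper's exact construction):
% Gaussian stochastic smoothing of the nuclear-norm potential, as used by
% FTPL-style methods. Z has i.i.d. standard Gaussian entries and sigma > 0
% is a smoothing scale.
\[
  S_{\sigma}(G) \;=\; \mathbb{E}_{Z}\bigl[\, \lVert G + \sigma Z \rVert_{*} \,\bigr].
\]
% Since G + sigma Z is almost surely full rank, the smoothed potential is
% differentiable, and its gradient is the expected polar factor of the
% perturbed matrix (up to sign conventions, this is the gradient-based
% prediction evaluated at the accumulated gradients):
\[
  \nabla S_{\sigma}(G) \;=\; \mathbb{E}_{Z}\bigl[\, U_{Z} V_{Z}^{\top} \,\bigr],
  \qquad
  G + \sigma Z = U_{Z} \Sigma_{Z} V_{Z}^{\top} \ \text{(SVD)}.
\]
```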