A Control-Theoretic Perspective on Optimal High-Order Optimization

We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $Φ: \mathbb{R}^d \rightarrow \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators $\nabla Φ$ and $\nabla^2 Φ$ together with a feedback control law $λ(\cdot)$ satisfying the algebraic equation $(λ(t))^p\|\nabla Φ(x(t))\|^{p-1} = θ$ for some $θ\in (0, 1)$. Our first contribution is to prove the existence and uniqueness of a local solution to this system via the Banach fixed-point theorem. We present a simple yet nontrivial Lyapunov function that allows us to establish the existence and uniqueness of a global solution under certain regularity conditions and to analyze the convergence properties of trajectories. The rate of convergence is $O(1/t^{(3p+1)/2})$ in terms of the objective function gap and $O(1/t^{3p})$ in terms of the squared gradient norm. Our second contribution is to provide two algorithmic frameworks obtained from discretization of our continuous-time system, one of which generalizes the large-step A-HPE framework and the other of which leads to a new optimal $p$-th order tensor algorithm. While our discrete-time analysis can be seen as a simplification and generalization of~\citet{Monteiro-2013-Accelerated}, it is largely motivated by the aforementioned continuous-time analysis, demonstrating the fundamental role that feedback control plays in optimal acceleration and the clear advantage that the continuous-time perspective brings to algorithmic design. A highlight of our analysis is that all of the $p$-th order optimal tensor algorithms that we discuss minimize the squared gradient norm at a rate of $O(k^{-3p})$, which complements the recent analysis.
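To make the feedback law concrete, the algebraic equation $(λ(t))^p\|\nabla Φ(x(t))\|^{p-1} = θ$ can be solved for $λ(t)$ in closed form whenever the gradient is nonzero: $λ(t) = (θ / \|\nabla Φ(x(t))\|^{p-1})^{1/p}$. The following is a minimal Python sketch of that rearrangement, together with the exponents of the two stated rates. The function names `feedback_lambda` and `rate_exponents` are hypothetical and chosen for illustration only; the sketch does not implement the closed-loop system or either algorithmic framework.

```python
def feedback_lambda(grad_norm: float, p: int, theta: float) -> float:
    """Solve lam**p * grad_norm**(p - 1) == theta for lam > 0.

    Illustrative helper (not from the paper): a direct rearrangement of the
    algebraic equation, valid only when grad_norm > 0.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    if p < 1:
        raise ValueError("p must be a positive integer")
    if grad_norm <= 0.0:
        # Simplifying assumption of this sketch: a nonzero gradient is required.
        raise ValueError("grad_norm must be positive")
    return (theta / grad_norm ** (p - 1)) ** (1.0 / p)


def rate_exponents(p: int) -> tuple:
    """Exponents of the stated rates: O(1/t^((3p+1)/2)) and O(1/t^(3p))."""
    return (3 * p + 1) / 2, 3 * p


if __name__ == "__main__":
    for p in (1, 2, 3):
        lam = feedback_lambda(grad_norm=2.0, p=p, theta=0.5)
        gap_exp, grad_exp = rate_exponents(p)
        print(f"p={p}: lambda={lam:.4f}, gap exponent={gap_exp}, "
              f"squared-gradient exponent={grad_exp}")
```

For $p = 1$ the gradient norm drops out and the law reduces to the constant $λ(t) = θ$; for $p \geq 2$ the value of $λ(t)$ grows as the gradient norm shrinks along the trajectory.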
