Hamilton and Moitra (2021) showed that, in certain regimes, it is not possible to accelerate Riemannian gradient descent in the hyperbolic plane if we restrict ourselves to algorithms which make queries in a (large) bounded domain and which receive gradients and function values corrupted by a (small) amount of noise. We show that acceleration remains unachievable for any deterministic algorithm which receives exact gradient and function-value information (unbounded queries, no noise). Our results hold for the classes of strongly and nonstrongly geodesically convex functions, and for a large class of Hadamard manifolds including hyperbolic spaces and the symmetric space $\mathrm{SL}(n) / \mathrm{SO}(n)$ of positive definite $n \times n$ matrices of determinant one. This cements a surprising gap between the complexity of convex optimization and that of geodesically convex optimization: for hyperbolic spaces, Riemannian gradient descent is optimal on the class of smooth and strongly geodesically convex functions, in the regime where the condition number scales with the radius of the optimization domain. The key idea in the proof of the lower bound is to perturb the hard functions of Hamilton and Moitra (2021) with sums of bump functions chosen by a resisting oracle.