High-order tensor methods for solving both convex and nonconvex optimization problems have generated significant research interest, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information and regularized by an appropriate power of the step norm. Developing efficient techniques for solving such subproblems is an ongoing topic of research, and this paper addresses the case of the third-order tensor subproblem. We propose the CQR algorithmic framework for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation by minimizing a sequence of local quadratic models that incorporate simple cubic and quartic terms. The role of the cubic term is to crudely approximate local tensor information, while the quartic term controls model regularization and progress. We provide necessary and sufficient optimality conditions that fully characterize the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. Using our optimality characterizations, we show that a CQR algorithmic variant has the optimal-order evaluation complexity of $\mathcal{O}(\epsilon^{-3/2})$ when applied to minimizing our quartically regularized cubic subproblem, a bound that can be further improved in special cases. We propose practical CQR variants that use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances, and even superior on ill-conditioned subproblems with special structure.
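To make the secular-equation idea concrete, the following is a minimal sketch rather than the paper's algorithm. It assumes a cubic-quartic model of the form $m(s) = f_0 + g^T s + \tfrac{1}{2} s^T H s + \tfrac{\beta}{6}\|s\|^3 + \tfrac{\sigma}{4}\|s\|^4$, where the scalar coefficients `beta` and `sigma`, the $1/6$ and $1/4$ scalings, and all function names are illustrative assumptions. It further restricts to the simplest setting, $H$ symmetric positive definite with $\beta \ge 0$ and $\sigma > 0$, in which the model is strictly convex and its unique stationary point is the global minimizer. The nonconvex cases treated in the paper (indefinite $H$, negative cubic coefficient) are not covered. Stationarity gives $(H + \lambda I)s = -g$ with $\lambda = \tfrac{\beta}{2}\|s\| + \sigma\|s\|^2$, so the step norm $r$ solves the scalar equation $\|(H + \lambda(r) I)^{-1} g\| = r$, which is solved here by plain bisection instead of a nonlinear eigenvalue technique.

```python
import numpy as np


def cq_model(s, f0, g, H, beta, sigma):
    """Assumed model: quadratic plus scalar cubic and quartic norm terms."""
    r = np.linalg.norm(s)
    return f0 + g @ s + 0.5 * s @ H @ s + beta / 6.0 * r**3 + sigma / 4.0 * r**4


def cq_gradient(s, g, H, beta, sigma):
    """Gradient of the assumed model: g + H s + (beta/2 ||s|| + sigma ||s||^2) s."""
    r = np.linalg.norm(s)
    return g + H @ s + (0.5 * beta * r + sigma * r**2) * s


def minimize_cq_model(g, H, beta, sigma, tol=1e-12, max_iter=200):
    """Minimize the assumed model for positive definite H, beta >= 0, sigma > 0.

    Solves the secular equation ||(H + lam(r) I)^{-1} g|| = r by bisection,
    where lam(r) = beta/2 * r + sigma * r^2. Returns the step s and lam.
    """
    if beta < 0 or sigma <= 0:
        raise ValueError("this sketch assumes beta >= 0 and sigma > 0")
    eigvals, Q = np.linalg.eigh(H)  # H = Q diag(eigvals) Q^T, eigvals ascending
    if eigvals[0] <= 0:
        raise ValueError("this sketch assumes H is positive definite")
    g_hat = Q.T @ g  # gradient expressed in the eigenbasis of H

    def step_norm(r):
        lam = 0.5 * beta * r + sigma * r**2
        return np.linalg.norm(g_hat / (eigvals + lam))

    # phi(r) = step_norm(r) - r is strictly decreasing with phi(0) >= 0, and
    # phi(step_norm(0)) <= 0, so the root lies in [0, step_norm(0)].
    lo, hi = 0.0, step_norm(0.0)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break

    r = 0.5 * (lo + hi)
    lam = 0.5 * beta * r + sigma * r**2
    s = -Q @ (g_hat / (eigvals + lam))
    return s, lam


if __name__ == "__main__":
    g = np.array([1.0, -2.0, 0.5])
    H = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 0.5], [0.0, 0.5, 2.0]])
    s, lam = minimize_cq_model(g, H, beta=1.0, sigma=2.0)
    print("step:", s)
    print("lambda:", lam)
    print("gradient norm:", np.linalg.norm(cq_gradient(s, g, H, 1.0, 2.0)))
```

The eigendecomposition is computed once so that each evaluation of the secular function costs only $O(n)$. This is convenient for small dense problems, but a factorization-based or iterative solver would likely be preferable at scale.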