Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex. Previous works develop near-optimal accelerated algorithms for minimizing this class of functions; however, they rely on a binary-search subroutine that requires multiple gradient evaluations in each iteration, so the total number of gradient evaluations does not match a known lower bound. In this work, we show that a recently proposed continuized Nesterov acceleration can be applied to minimizing quasar convex functions and achieves the optimal bound with high probability. Furthermore, we find that the objective functions of training generalized linear models (GLMs) satisfy quasar convexity, which broadens the applicability of the relevant algorithms, given that known practical examples of quasar convexity in non-convex learning are sparse in the literature. We also show that if a smooth and one-point strongly convex, Polyak-Lojasiewicz, or quadratic-growth function satisfies quasar convexity, then an accelerated linear rate for minimizing the function is attainable under certain conditions, whereas acceleration is not known in general for these classes of functions.
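
For reference, the condition can be written out as follows. This is the definition commonly used in the quasar-convexity literature, not one stated in the abstract itself; the symbols $\gamma$ (the quasar-convexity parameter), $\mu$ (the strong quasar-convexity parameter), and $x_*$ (a minimizer of $f$) are notation assumed here for illustration.

```latex
% gamma-quasar convexity of a differentiable f with respect to a minimizer x_*,
% for some gamma in (0, 1]:
\[
  f(x_*) \;\ge\; f(x) + \frac{1}{\gamma}\,\langle \nabla f(x),\, x_* - x \rangle
  \qquad \text{for all } x .
\]
% (gamma, mu)-strong quasar convexity adds a quadratic term, with mu > 0:
\[
  f(x_*) \;\ge\; f(x) + \frac{1}{\gamma}\,\langle \nabla f(x),\, x_* - x \rangle
  + \frac{\mu}{2}\,\lVert x_* - x \rVert^2
  \qquad \text{for all } x .
\]
```

With $\gamma = 1$ the first inequality is the usual star-convexity condition. Every differentiable convex function satisfies it, but the inequality is only required to hold relative to $x_*$, so non-convex functions can satisfy it as well.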