We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives that are possibly neither smooth nor convex, using only noisy function evaluations. Recent works proposed several stochastic zero-order algorithms that solve this task, all of which suffer from a dimension dependence of $\Omega(d^{3/2})$, where $d$ is the dimension of the problem; this dependence was conjectured to be optimal. We refute this conjecture by providing a faster algorithm with complexity $O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants) with respect to $d$ and also optimal with respect to the accuracy parameters $\delta,\epsilon$, thus solving an open question due to Lin et al. (NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also optimal for smooth objectives, proving that in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization. We provide algorithms that achieve the aforementioned convergence rate in expectation as well as with high probability. Our analysis is based on a simple yet powerful geometric lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in first-order nonsmooth nonconvex optimization.
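For intuition about the zero-order setting, a common building block in this line of work is the two-point randomized-smoothing estimator $g = \frac{d}{2\delta}\bigl(f(x+\delta u) - f(x-\delta u)\bigr)u$, with $u$ drawn uniformly from the unit sphere. For Lipschitz $f$, this is an unbiased estimate of the gradient of the smoothed function $f_\delta(x) = \mathbb{E}_v[f(x+\delta v)]$, where $v$ is uniform in the unit ball. The sketch below illustrates only this estimator and should not be read as the algorithm of the paper. All function names are hypothetical, and function evaluations are assumed exact rather than noisy to keep the example short.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a point uniformly from the unit sphere in R^d (normalized Gaussian)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def two_point_estimate(f, x, delta, rng):
    """One-sample estimate of the gradient of the smoothed function f_delta at x.

    Costs two function evaluations: f(x + delta*u) and f(x - delta*u).
    """
    d = len(x)
    u = sample_unit_sphere(d, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def averaged_estimate(f, x, delta, num_samples, seed=0):
    """Average several independent two-point estimates to reduce variance."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = two_point_estimate(f, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]


def l1_norm(x):
    """A Lipschitz objective that is nonsmooth wherever a coordinate is zero."""
    return sum(abs(v) for v in x)


if __name__ == "__main__":
    # Illustrative values only: a point away from the kinks of the l1 norm.
    x = [1.0, -2.0, 0.5]
    print(averaged_estimate(l1_norm, x, delta=0.1, num_samples=5000))
```

Each estimate has norm at most $dL$ for an $L$-Lipschitz $f$, so its second moment can grow with the dimension. This is why the dependence on $d$ is the central quantity in complexity bounds for this setting.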