Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm

This work studies minimization problems with zero-order noisy oracle information under the assumption that the objective function is highly smooth and possibly satisfies additional properties. We consider two kinds of zero-order projected gradient descent algorithms, which differ in the form of the gradient estimator. The first algorithm uses a gradient estimator based on randomization over the $\ell_2$ sphere due to Bach and Perchet (2016). We present an improved analysis of this algorithm on the class of highly smooth and strongly convex functions studied in the prior work, and we derive rates of convergence for two more general classes of non-convex functions: highly smooth functions satisfying the Polyak-{\L}ojasiewicz condition, and highly smooth functions with no additional property. The second algorithm is based on randomization over the $\ell_1$ sphere, and it extends to the highly smooth setting the algorithm recently proposed for Lipschitz convex functions in Akhavan et al. (2022). We show that, in the case of a noiseless oracle, this novel algorithm enjoys better bounds on bias and variance than the $\ell_2$ randomization and the commonly used Gaussian randomization algorithms, while in the noisy case both the $\ell_1$ and $\ell_2$ algorithms benefit from similar improved theoretical guarantees. The improvements are achieved thanks to new proof techniques based on Poincar\'e-type inequalities for uniform distributions on the $\ell_1$ or $\ell_2$ spheres. The results are established under weak (almost adversarial) assumptions on the noise. Moreover, we provide minimax lower bounds proving optimality or near optimality of the obtained upper bounds in several cases.

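To make the first algorithm concrete, the following is a minimal sketch of zero-order projected gradient descent with a two-point gradient estimator randomized over the $\ell_2$ sphere. It is an illustration under stated assumptions, not the paper's exact procedure: the kernel $K(r) = 3r$ (chosen so that $\mathbb{E}[rK(r)] = 1$ for $r$ uniform on $[-1,1]$), the step-size and perturbation schedules, the quadratic test objective, the Gaussian noise, and all function names are hypothetical choices made for this example. The $\ell_1$ variant is not covered here.

```python
import numpy as np

def l2_sphere_gradient_estimate(oracle, x, h, rng):
    """Two-point gradient estimate from noisy function values only.

    Assumption: kernel K(r) = 3r, for which E[r K(r)] = 1 when r ~ U[-1, 1].
    """
    d = x.shape[0]
    # A normalized Gaussian vector is uniform on the unit l2 sphere.
    zeta = rng.standard_normal(d)
    zeta /= np.linalg.norm(zeta)
    r = rng.uniform(-1.0, 1.0)
    kernel = 3.0 * r
    # Two oracle queries at symmetric perturbations of x.
    y_plus = oracle(x + h * r * zeta)
    y_minus = oracle(x - h * r * zeta)
    return (d / (2.0 * h)) * (y_plus - y_minus) * kernel * zeta

def project_l2_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def zero_order_pgd(oracle, x0, n_steps, radius, rng):
    """Projected gradient descent driven by the estimator above."""
    x = x0.copy()
    for t in range(1, n_steps + 1):
        eta = 1.0 / (t + 10.0)       # hypothetical step-size schedule
        h = (t + 10.0) ** -0.25      # hypothetical perturbation schedule
        g = l2_sphere_gradient_estimate(oracle, x, h, rng)
        x = project_l2_ball(x - eta * g, radius)
    return x

# Toy problem (assumption): strongly convex quadratic with additive noise.
rng = np.random.default_rng(0)
x_star = np.full(5, 0.5)

def noisy_oracle(x):
    return 0.5 * np.sum((x - x_star) ** 2) + 0.05 * rng.standard_normal()

x0 = np.zeros(5)
x_final = zero_order_pgd(noisy_oracle, x0, n_steps=3000, radius=2.0, rng=rng)
print("initial distance to minimizer:", np.linalg.norm(x0 - x_star))
print("final distance to minimizer:  ", np.linalg.norm(x_final - x_star))
```

The estimator never evaluates a gradient: each iteration uses only two noisy function values, and the projection keeps the iterates inside the constraint set. On a linear or quadratic objective the symmetric difference removes the even-order terms, so in this toy setting the estimate is unbiased for the gradient up to the oracle noise.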