In stochastic zeroth-order optimization, a problem of practical relevance is how to fully exploit the local geometry of the underlying objective function. We consider a fundamental setting in which the objective function is quadratic and provide the first tight characterization of the optimal Hessian-dependent sample complexity. Our contribution is twofold. First, from an information-theoretic point of view, we prove tight lower bounds on Hessian-dependent complexities by introducing a concept called energy allocation, which captures the interaction between the search algorithm and the geometry of the objective function. A matching upper bound is obtained by solving for the optimal energy spectrum. Second, from an algorithmic point of view, we show the existence of a Hessian-independent algorithm that universally achieves the asymptotically optimal sample complexity for all Hessian instances. The optimal sample complexities achieved by our algorithm remain valid under heavy-tailed noise distributions, which is enabled by a truncation method.
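To make the role of truncation under heavy-tailed noise concrete, the following is a minimal sketch of a generic clipped-mean estimator applied to noisy zeroth-order queries of a quadratic. The abstract does not specify the truncation scheme, so this is an illustration under assumptions and not the paper's algorithm: the names `truncated_mean` and `noisy_quadratic`, the diagonal Hessian, the symmetric Pareto-type noise, and the threshold value are all hypothetical choices made for the example.

```python
import random


def truncated_mean(samples, threshold):
    """Average the samples after clipping each one to [-threshold, threshold].

    Generic clipped-mean estimator (an assumption for illustration):
    clipping limits the influence of rare, very large noise realizations.
    """
    clipped = [max(-threshold, min(threshold, s)) for s in samples]
    return sum(clipped) / len(clipped)


def noisy_quadratic(x, hessian_diag, rng):
    """Hypothetical zeroth-order oracle for f(x) = 0.5 * sum_i h_i * x_i^2.

    Returns the function value plus symmetric heavy-tailed noise
    (a randomly signed, shifted Pareto variate with shape 2.5).
    """
    value = 0.5 * sum(h * xi * xi for h, xi in zip(hessian_diag, x))
    sign = 1.0 if rng.random() < 0.5 else -1.0
    noise = sign * (rng.paretovariate(2.5) - 1.0)
    return value + noise


rng = random.Random(0)          # fixed seed so the run is reproducible
x = [0.5, -0.2]                 # query point (illustrative)
hessian_diag = [2.0, 1.0]       # diagonal Hessian (illustrative)

# Query the noisy oracle repeatedly at the same point and aggregate
# the observations with the clipped mean.
samples = [noisy_quadratic(x, hessian_diag, rng) for _ in range(2000)]
estimate = truncated_mean(samples, threshold=10.0)
print(estimate)
```

The threshold trades bias against robustness: a smaller value suppresses outliers more aggressively but distorts the estimate when the true function value is large relative to the threshold.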