In stochastic zeroth-order optimization, a problem of practical relevance is understanding how to fully exploit the local geometry of the underlying objective function. We consider a fundamental setting in which the objective function is quadratic, and provide the first tight characterization of the optimal Hessian-dependent sample complexity. Our contribution is twofold. First, from an information-theoretic point of view, we prove tight lower bounds on Hessian-dependent complexities by introducing a concept called energy allocation, which captures the interaction between the search algorithm and the geometry of the objective function. A matching upper bound is obtained by solving for the optimal energy spectrum. Second, from an algorithmic point of view, we show the existence of a Hessian-independent algorithm that universally achieves the asymptotically optimal sample complexities for all Hessian instances. These optimal sample complexities remain valid under heavy-tailed noise distributions, which is made possible by a truncation method.
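To make the role of truncation under heavy-tailed noise concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a diagonal Hessian, symmetric Pareto-type noise, and a plain clipping rule with a fixed threshold; the function names and all parameter values are hypothetical choices for illustration only.

```python
import random


def quadratic(x, hessian_diag):
    """Evaluate f(x) = 0.5 * sum_i h_i * x_i^2 (diagonal Hessian assumed)."""
    return 0.5 * sum(h * xi * xi for h, xi in zip(hessian_diag, x))


def noisy_query(x, hessian_diag, rng):
    """Zeroth-order oracle: f(x) plus symmetric heavy-tailed noise.

    The noise magnitude is Pareto(2.5) shifted to start at 0, with a random
    sign, so it has zero mean but heavy tails (an illustrative assumption).
    """
    noise = rng.choice((-1.0, 1.0)) * (rng.paretovariate(2.5) - 1.0)
    return quadratic(x, hessian_diag) + noise


def truncate(value, threshold):
    """Clip value to the interval [-threshold, threshold]."""
    return max(-threshold, min(threshold, value))


def truncated_mean(samples, threshold):
    """Average the samples after clipping each one, limiting outlier impact."""
    return sum(truncate(s, threshold) for s in samples) / len(samples)


# Usage: estimate f(x) from repeated noisy queries at a fixed point.
rng = random.Random(0)
x = [1.0, 2.0]
hessian_diag = [2.0, 1.0]  # hypothetical Hessian eigenvalues
samples = [noisy_query(x, hessian_diag, rng) for _ in range(20000)]
estimate = truncated_mean(samples, threshold=50.0)
print("true value:", quadratic(x, hessian_diag))
print("truncated-mean estimate:", estimate)
```

Clipping bounds the contribution of any single query, trading a small bias for much lighter tails in the estimate. How the threshold should be chosen is not specified in the abstract, so the fixed value here is only a placeholder.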