We investigate the convergence properties of a class of iterative algorithms designed to minimise a potentially non-smooth and noisy objective function, which may be algebraically intractable and whose values may only be obtained as the output of a black box. The algorithms considered can be cast under the umbrella of a generalised gradient descent recursion, where the gradient is that of a smooth approximation of the objective function. The framework we develop includes as special cases model-based and mollification methods, two classical approaches to zeroth-order optimisation. The convergence results are obtained under very weak assumptions on the regularity of the objective function and involve a trade-off between the degree of smoothing and the size of the steps taken in the parameter updates. As expected, additional assumptions are required in the stochastic case. We illustrate the relevance of these algorithms and our convergence results through a challenging classification example from machine learning.
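To make the class of algorithms concrete, the sketch below illustrates one special case of the recursion: mollification via Gaussian smoothing, where the gradient of the smoothed surrogate $f_\sigma(x) = \mathbb{E}[f(x + \sigma Z)]$, $Z \sim \mathcal{N}(0, I)$, is estimated from black-box evaluations of $f$ alone using a two-point Monte Carlo estimator. This is an illustrative sketch, not the paper's algorithm or notation; the names `sigma` (smoothing level), `gamma` (step size), and `n_samples` are assumptions chosen for the example.

```python
import numpy as np

def smoothed_grad(f, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the gradient of the Gaussian-smoothed
    surrogate f_sigma(x) = E[f(x + sigma * Z)], Z ~ N(0, I), via the
    unbiased two-point estimator (f(x + s*z) - f(x - s*z)) * z / (2s).
    Only black-box evaluations of f are required, so f may be
    non-smooth or algebraically intractable."""
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(n_samples):
        z = rng.standard_normal(d)
        g += (f(x + sigma * z) - f(x - sigma * z)) / (2.0 * sigma) * z
    return g / n_samples

def zeroth_order_descent(f, x0, steps=500, sigma=0.5, gamma=0.1,
                         n_samples=20, seed=0):
    """Gradient descent on the smoothed surrogate. The smoothing level
    sigma and the step size gamma are the two quantities traded off
    against each other in a convergence analysis of this kind."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - gamma * smoothed_grad(f, x, sigma, n_samples, rng)
    return x

# Example: a non-smooth objective treated as a black box; the iterates
# approach the minimiser (1, -2) despite f being non-differentiable there.
if __name__ == "__main__":
    f = lambda x: np.abs(x[0] - 1.0) + np.abs(x[1] + 2.0)
    print(zeroth_order_descent(f, x0=[5.0, 5.0]))
```

A smaller `sigma` makes the surrogate a closer approximation of $f$ but increases the variance of the gradient estimate, which is one way the smoothing/step-size trade-off mentioned above manifests in practice.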