We assess the use of neural networks as surrogate models to approximate and minimize objective functions in optimization problems. While neural networks are widely used for machine learning tasks such as classification and regression, their application to solving optimization problems has been limited. Our study begins by determining the best activation function for approximating the objective functions of popular nonlinear optimization test problems, and the evidence we provide shows that SiLU performs best. We then analyze the accuracy of the function value, gradient, and Hessian approximations of such objective functions obtained through interpolation/regression models and through neural networks. Compared to interpolation/regression models, neural networks can deliver competitive zero- and first-order approximations (at a high training cost) but underperform on second-order approximation. However, we show that combining a neural network activation function with the natural basis for quadratic interpolation/regression removes the need to include cross terms in the natural basis, leading to models with fewer parameters to determine. Lastly, we provide evidence that the performance of a state-of-the-art derivative-free optimization algorithm can hardly be improved when the gradient of the objective function is approximated using any of the surrogate models considered, including neural networks.
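To make the parameter saving concrete, the following minimal sketch evaluates the SiLU activation and counts the terms of the natural quadratic basis with and without cross terms. The function names `silu` and `natural_basis`, the term ordering, and the 1/2 scaling of the squared terms are illustrative assumptions, and the sketch does not reproduce how the activation function is actually composed with the basis in the study.

```python
import math


def silu(x: float) -> float:
    """SiLU activation, x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def natural_basis(x, cross_terms=True):
    """Evaluate the natural (monomial) quadratic basis at a point x in R^n.

    Assumed ordering: constant, linear terms, squared terms scaled by 1/2,
    then (optionally) the cross terms x_i * x_j for i < j.
    """
    n = len(x)
    terms = [1.0] + list(x) + [0.5 * xi * xi for xi in x]
    if cross_terms:
        terms += [x[i] * x[j] for i in range(n) for j in range(i + 1, n)]
    return terms


n = 5
point = [1.0] * n
full = len(natural_basis(point))                          # (n + 1)(n + 2) / 2 terms
no_cross = len(natural_basis(point, cross_terms=False))   # 2n + 1 terms
print(full, no_cross)
```

Dropping the cross terms reduces the number of coefficients from quadratic to linear growth in the dimension n, which is the sense in which such models have fewer parameters to determine.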