A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications

Gradient methods are seeing rapid methodological and theoretical development, driven by the challenges of optimization problems arising in data science. For data science applications in which objective function evaluations are expensive but gradient evaluations are inexpensive, gradient methods that never evaluate the objective function are being either revived or actively developed. However, as we show, all such gradient methods are susceptible to catastrophic divergence under conditions that are realistic for data science applications. Gradient methods that do use objective function evaluations therefore become more appealing; yet, as we show, they can incur an exponential increase in objective evaluations between accepted iterates. As a result, existing gradient methods are poorly suited to the optimization problems arising in data science. In this work, we address this gap by developing a general methodology that uses objective function evaluations economically and in a problem-driven manner, both to prevent catastrophic divergence and to avoid an explosion in objective evaluations between accepted iterates. Our methodology admits specific procedures built on particular step size selection or search direction strategies, and we develop a novel step size selection strategy that is well suited to data science applications. We show that a procedure resulting from our methodology is highly competitive with standard optimization methods on CUTEst test problems. We then show that a procedure resulting from our methodology compares highly favorably with standard optimization methods on optimization problems arising in our target data science applications. Thus, we provide a novel gradient methodology that is better suited to optimization problems arising in data science.
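
To make the notion of "objective evaluations between accepted iterates" concrete, the following is a minimal sketch of gradient descent with a textbook Armijo backtracking line search, instrumented to count objective evaluations per accepted iterate. It is not the methodology proposed in this work; the function name `backtracking_gd`, its parameters, and the default values are all assumptions chosen for illustration.

```python
import numpy as np

def backtracking_gd(f, grad, x0, step0=1.0, shrink=0.5, c=1e-4,
                    max_iter=100, max_backtracks=60, tol=1e-8):
    """Gradient descent with Armijo backtracking (illustrative only).

    Returns the final iterate and a list holding the number of objective
    evaluations spent to produce each accepted iterate.
    """
    x = np.asarray(x0, dtype=float)
    evals_per_iter = []
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        fx = f(x)          # one objective evaluation at the current iterate
        evals = 1
        step = step0
        accepted = False
        for _ in range(max_backtracks):
            trial = x - step * g
            evals += 1     # one objective evaluation per trial step
            # Armijo sufficient-decrease condition
            if f(trial) <= fx - c * step * g.dot(g):
                accepted = True
                break
            step *= shrink
        if not accepted:
            break          # give up rather than loop forever
        x = trial
        evals_per_iter.append(evals)
    return x, evals_per_iter

# Hypothetical test problem: a steep quadratic, where the initial step is far too large.
f = lambda x: 50.0 * x.dot(x)
grad = lambda x: 100.0 * x
x_final, evals = backtracking_gd(f, grad, [1.0, -2.0])
print(np.linalg.norm(x_final), evals[:5])
```

In this sketch the number of trial evaluations per accepted iterate depends on how far the initial trial step is from one that satisfies the sufficient-decrease test, which is the cost that an economical use of objective evaluations aims to control.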
