Optimization in deep learning involves minimizing a high-dimensional loss function over the weight space, a task often perceived as difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources. In this paper, we provide a comprehensive review of $14$ standard optimization methods successfully used in deep learning research, together with a theoretical assessment of the difficulties in numerical optimization drawn from the optimization literature.