Methods with adaptive scaling of different features play a key role in solving saddle point problems (SPPs), primarily because of Adam's popularity for adversarial machine learning problems, including GAN training. This paper presents a theoretical analysis of the following scaling techniques for solving SPPs: the well-known Adam and RMSProp scalings, and the newer AdaHessian and OASIS, which are based on the Hutchinson approximation. We use the Extra Gradient method and its improved version with negative momentum as the base methods. Experimental studies on GANs show good applicability not only for Adam, but also for the other, less popular methods.
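The Hutchinson approximation mentioned above is commonly stated as an estimate of a matrix diagonal, diag(H) ≈ (1/m) Σ z_i ⊙ (H z_i), where the z_i are random vectors with independent ±1 entries. The sketch below illustrates that estimator in plain Python on small explicit matrices. It is not taken from the paper: the names `hutchinson_diagonal`, `make_hvp`, `H_DIAGONAL` and `H_DENSE` are hypothetical, and in practice the product H z would come from automatic differentiation rather than from a stored matrix.

```python
import random


def hutchinson_diagonal(hvp, dim, num_samples=100, seed=0):
    """Estimate diag(H) as the average of z * (H z) over Rademacher vectors z.

    `hvp` is an assumed callable that returns the product H @ v for a list v.
    """
    rng = random.Random(seed)
    estimate = [0.0] * dim
    for _ in range(num_samples):
        # Rademacher vector: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / num_samples
    return estimate


def make_hvp(matrix):
    """Build a matrix-vector product from an explicit matrix (illustration only)."""
    def hvp(v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]
    return hvp


# Toy matrices chosen for illustration; they are not from the paper.
H_DIAGONAL = [[2.0, 0.0], [0.0, 5.0]]
H_DENSE = [[2.0, 1.0], [1.0, 5.0]]

# For a diagonal matrix a single sample already recovers the diagonal,
# because every entry of z squares to 1.
diag_exact = hutchinson_diagonal(make_hvp(H_DIAGONAL), dim=2, num_samples=1)

# With off-diagonal entries the estimate is noisy and improves with more samples.
diag_noisy = hutchinson_diagonal(make_hvp(H_DENSE), dim=2, num_samples=2000)
```

Scaling methods of this kind use such a diagonal estimate, typically after taking absolute values and bounding it away from zero, as a per-coordinate preconditioner. That safeguarding step is omitted here to keep the sketch short.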