This paper introduces StochGradAdam, an optimizer that extends the widely used Adam algorithm with a gradient sampling technique. At each iteration, StochGradAdam samples a subset of the gradients and uses only that subset in the update. This selective treatment is intended to preserve stable convergence while making training more robust: it may reduce the influence of noisy or outlier data and encourage broader exploration of the loss landscape, leading to more dependable convergence. In both image classification and segmentation tasks, StochGradAdam outperformed the standard Adam optimizer. Because only a subset of gradients is considered at each step, the method is well suited to training complex models. The paper covers StochGradAdam's methodology in full, from its mathematical foundations to its bias correction strategy, and presents it as a promising direction for deep learning training techniques.
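As a rough illustration of the idea, the sketch below applies a random mask to the gradient before a standard Adam update with bias correction. The abstract does not specify the sampling rule, so the per-element Bernoulli mask, the `sample_rate` parameter, and the function name `stoch_grad_adam_step` are assumptions made for this example rather than the paper's actual formulation.

```python
import numpy as np

def stoch_grad_adam_step(param, grad, m, v, t, rng,
                         lr=1e-3, beta1=0.9, beta2=0.999,
                         eps=1e-8, sample_rate=0.8):
    """One Adam-style update that uses a randomly sampled subset of gradients.

    Assumption: each gradient entry is kept independently with probability
    `sample_rate`; dropped entries contribute zero to the moment estimates.
    """
    # Sample which gradient entries take part in this update (assumed rule).
    mask = rng.random(grad.shape) < sample_rate
    sampled = np.where(mask, grad, 0.0)

    # Standard Adam first and second moment estimates, fed the sampled gradient.
    m = beta1 * m + (1.0 - beta1) * sampled
    v = beta2 * v + (1.0 - beta2) * sampled ** 2

    # Standard Adam bias correction for the zero-initialized moments.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
rng = np.random.default_rng(0)
x = np.array([1.0, -2.0, 3.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 2001):
    x, m, v = stoch_grad_adam_step(x, 2.0 * x, m, v, t, rng, lr=0.01)
```

Setting `sample_rate=1.0` keeps every gradient entry, so the step reduces to plain Adam, which makes the sampled and unsampled variants easy to compare on the same problem.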