This paper introduces StochGradAdam, an optimizer that extends the widely used Adam algorithm with a gradient sampling technique. At each iteration, StochGradAdam samples a subset of the gradients to use in the update, which makes it well suited to training complex models. This selective use of gradients keeps convergence stable and may make training more robust, both by mitigating the effect of noisy or outlier data and by encouraging broader exploration of the loss landscape. In image classification and segmentation tasks, StochGradAdam outperforms the standard Adam optimizer. The paper describes the method in full, from its mathematical foundations to its bias correction strategies, and presents it as a promising direction for deep learning training techniques.
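To make the idea of gradient sampling concrete, the following is a minimal NumPy sketch of one Adam-style update in which a random subset of gradient entries is kept and the rest are zeroed before the moment estimates are updated. The abstract does not specify the sampling scheme, so the element-wise Bernoulli mask, the function name `stoch_grad_adam_step`, the `sample_rate` parameter, and all default hyperparameters are assumptions made for illustration. They are not the paper's definition of StochGradAdam.

```python
import numpy as np


def stoch_grad_adam_step(param, grad, m, v, t, rng, lr=1e-3,
                         beta1=0.9, beta2=0.999, eps=1e-8, sample_rate=0.8):
    """One Adam-style step on a randomly sampled subset of gradient entries.

    Assumption: each entry is kept independently with probability
    `sample_rate`; the paper's actual sampling rule may differ.
    """
    # Bernoulli mask: True where the gradient entry is kept this iteration.
    mask = rng.random(grad.shape) < sample_rate
    sampled = np.where(mask, grad, 0.0)

    # Standard Adam moment updates, applied to the sampled gradient.
    m = beta1 * m + (1.0 - beta1) * sampled
    v = beta2 * v + (1.0 - beta2) * sampled ** 2

    # Standard Adam bias correction for the zero-initialized moments.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(x0, steps=500, lr=0.05, sample_rate=0.8, seed=0):
    """Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        x, m, v = stoch_grad_adam_step(x, x, m, v, t, rng,
                                       lr=lr, sample_rate=sample_rate)
    return x


if __name__ == "__main__":
    start = np.array([3.0, -2.0, 1.5, -1.0])
    end = minimize_quadratic(start)
    print("initial norm:", np.linalg.norm(start))
    print("final norm:  ", np.linalg.norm(end))
```

With `sample_rate=1.0` the mask keeps every entry and the step reduces to ordinary Adam, which gives a convenient baseline for comparing the two on the same problem.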