In this paper, we introduce StochGradAdam, a novel optimizer that extends the Adam algorithm with stochastic gradient sampling to improve computational efficiency while maintaining robust performance. StochGradAdam selectively samples a subset of gradients at each training step, reducing computational cost while preserving the adaptive learning rates and bias corrections of Adam. Experiments on image classification and segmentation tasks demonstrate that StochGradAdam achieves performance comparable or superior to Adam, even when applying fewer gradient updates per iteration. By focusing on key gradient updates, StochGradAdam offers stable convergence and enhanced exploration of the loss landscape while mitigating the impact of noisy gradients. These results suggest that the approach is particularly effective for large-scale models and datasets, providing a promising alternative to traditional optimization techniques in deep learning.
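To make the idea concrete, the sketch below shows one way a StochGradAdam-style update could look: a standard Adam step (first and second moment estimates plus bias correction) applied only to a randomly sampled subset of gradient entries. The Bernoulli masking scheme and the `sample_rate` parameter are illustrative assumptions; the paper's exact sampling rule may differ.

```python
import numpy as np

def stochgradadam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                       eps=1e-8, sample_rate=0.5, rng=None):
    """Illustrative single update step (assumed sampling scheme, not the
    paper's definitive implementation). `t` is the 1-based step count."""
    rng = rng or np.random.default_rng()

    # Assumed sampling rule: keep each gradient entry with probability
    # `sample_rate`, zeroing out the rest.
    mask = rng.random(grad.shape) < sample_rate
    sampled_grad = np.where(mask, grad, 0.0)

    # Adam-style exponential moving averages of the sampled gradient
    m = beta1 * m + (1 - beta1) * sampled_grad
    v = beta2 * v + (1 - beta2) * sampled_grad ** 2

    # Bias correction, as in Adam
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Parameter update restricted to the sampled coordinates
    update = lr * m_hat / (np.sqrt(v_hat) + eps)
    param = param - np.where(mask, update, 0.0)
    return param, m, v
```

Under these assumptions, only the masked coordinates incur moment updates and parameter changes each step, which is where the claimed reduction in per-iteration gradient work would come from.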