The performance of deep networks depends heavily on their optimizers, yet precisely and efficiently tracking the gradient trend remains a challenge. Existing optimizers predominantly rely on the first-order exponential moving average (EMA), which introduces a noticeable lag that impedes real-time tracking of gradient trends and consequently yields sub-optimal performance. To overcome this limitation, we introduce a novel optimizer called fast-adaptive moment estimation (FAME). Inspired by the triple exponential moving average (TEMA) used in the financial domain, FAME leverages the higher-order TEMA to identify gradient trends more precisely. In FAME, TEMA plays a central role in the learning process because it actively influences the optimization dynamics; this differs from its conventional, passive role as a technical indicator in finance. By incorporating TEMA into the optimization process, FAME identifies gradient trends with higher accuracy and less lag, and thus responds to gradient fluctuations more smoothly and consistently than conventional first-order EMA. To study the effectiveness of FAME, we conducted comprehensive experiments on six diverse computer-vision benchmarks and tasks spanning detection, classification, and semantic comprehension. We integrated FAME into 15 learning architectures and compared its performance with that of six popular optimizers. The results show that FAME is more robust and accurate and provides superior performance stability by minimizing noise (i.e., trend fluctuations). Notably, FAME reaches higher accuracy in markedly fewer training epochs than its counterparts, indicating its value for optimizing deep networks in computer-vision tasks.
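To make the lag argument concrete, the following minimal sketch compares a first-order EMA with the standard financial TEMA, 3·EMA1 − 3·EMA2 + EMA3, where EMA2 smooths EMA1 and EMA3 smooths EMA2. It is not the FAME update rule, which this abstract does not specify. The function names `ema` and `tema`, the smoothing factor `alpha`, the seeding with the first sample, and the linear ramp standing in for a steadily trending gradient are all assumptions made for illustration.

```python
def ema(values, alpha):
    """First-order EMA: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    Seeding s with the first sample is an illustrative choice.
    """
    smoothed = []
    s = values[0]
    for x in values:
        s = alpha * x + (1 - alpha) * s
        smoothed.append(s)
    return smoothed


def tema(values, alpha):
    """Standard financial TEMA: 3*EMA1 - 3*EMA2 + EMA3."""
    e1 = ema(values, alpha)   # EMA of the raw signal
    e2 = ema(e1, alpha)       # EMA of EMA1
    e3 = ema(e2, alpha)       # EMA of EMA2
    return [3 * a - 3 * b + c for a, b, c in zip(e1, e2, e3)]


# Hypothetical test signal: a linear ramp, i.e. a constant trend.
ramp = [float(t) for t in range(200)]
alpha = 0.1

# Lag = how far each smoother trails the true signal at the last step.
# On a ramp, the EMA lag settles near (1 - alpha) / alpha, while the
# three TEMA terms cancel that first-order lag.
lag_ema = ramp[-1] - ema(ramp, alpha)[-1]
lag_tema = ramp[-1] - tema(ramp, alpha)[-1]

print(f"EMA lag:  {lag_ema:.4f}")
print(f"TEMA lag: {lag_tema:.4f}")
```

On a steady trend the single EMA trails the signal by a fixed offset, whereas the TEMA combination tracks it closely, which is the behavior the abstract attributes to higher-order smoothing. How FAME embeds TEMA in its moment estimates may differ from this sketch.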