We develop the theory of Energy Conserving Descent (ECD) and introduce ECDSep, a gradient-based optimization algorithm for both convex and non-convex problems. The method builds on the novel ECD framework, which treats optimization as the physical evolution of a suitable chaotic, energy-conserving dynamical system. This enables analytic control of the distribution of results (dominated at low loss), even for generic high-dimensional problems with no symmetries. Compared to previous realizations of this idea, we exploit this theoretical control to improve both the dynamics and the chaos-inducing elements, enhancing performance while simplifying hyper-parameter tuning of the algorithm across different classes of problems. We empirically compare ECDSep with popular optimization methods such as SGD, Adam and AdamW on a wide range of machine learning problems, finding performance competitive with or better than the best of them on each task. We also identify limitations in our analysis that point to possibilities for further improvement.