The dynamics of gradient-based training in neural networks often exhibit nontrivial structure; understanding these dynamics remains a central challenge in theoretical machine learning. In particular, the concept of feature unlearning, in which a neural network progressively loses previously learned features over long training, has attracted attention. In this study, we consider the infinite-width limit of a two-layer neural network updated with large-batch stochastic gradients and derive differential equations with separated time scales, revealing the mechanism and conditions under which feature unlearning occurs. Specifically, we exploit fast-slow dynamics: the alignment of the first-layer weights develops rapidly, while the second-layer weights evolve slowly. The direction of the flow on a critical manifold, determined by the slow dynamics, decides whether feature unlearning occurs. We validate this result numerically and derive a theoretical grounding and scaling laws for feature unlearning. Our results yield the following insights: (i) the strength of the primary nonlinear term in the data induces feature unlearning, and (ii) the initial scale of the second-layer weights mitigates it. Technically, our analysis relies on Tensor Programs and singular perturbation theory.
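For illustration, the fast-slow structure referred to above can be written in the generic singular-perturbation form (a schematic sketch only; the variables $u$, $v$, the vector fields $F$, $G$, and the small parameter $\epsilon$ are illustrative placeholders, not the paper's actual equations):
\[
\epsilon\,\frac{du}{dt} = F(u, v), \qquad \frac{dv}{dt} = G(u, v),
\]
where, in this reading, $u$ stands in for the rapidly developing first-layer alignment and $v$ for the slowly evolving second-layer weights. Taking $\epsilon \to 0$ confines the fast variable to the critical manifold $\{(u,v) : F(u,v) = 0\}$, and the slow flow $dv/dt = G(u,v)$ restricted to that manifold governs the long-time behavior, which here determines whether feature unlearning occurs.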