On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

We present a comprehensive analysis of how two-layer neural networks learn features to solve the modular addition task. Our work provides a full mechanistic interpretation of the learned model and a theoretical explanation of its training dynamics. While prior work has identified that individual neurons learn single-frequency Fourier features and align their phases, it does not fully explain how these features combine into a global solution. We bridge this gap by formalizing a diversification condition that emerges during training in overparametrized networks and consists of two parts: phase symmetry and frequency diversification. We prove that these properties allow the network to collectively approximate a flawed indicator function that still captures the correct logic of the modular addition task. While individual neurons produce noisy signals, phase symmetry enables a majority-voting scheme that cancels out the noise, allowing the network to robustly identify the correct sum. Furthermore, we explain the emergence of these features under random initialization via a lottery ticket mechanism. Our gradient flow analysis proves that frequencies compete within each neuron, with the "winner" determined by its initial spectral magnitude and phase alignment. From a technical standpoint, we provide a rigorous characterization of the layer-wise phase coupling dynamics and formalize the competitive landscape using the ODE comparison lemma. Finally, we use these insights to demystify grokking, characterizing it as a three-stage process in which memorization is followed by two generalization phases, driven by the competition between loss minimization and weight decay.
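The role of frequency diversification can be illustrated with a small numerical sketch. It is an idealized toy, not the network analyzed in the paper: each hypothetical "neuron" contributes a clean cosine vote cos(2πk(x + y − z)/p) at a single frequency k, and the noise terms mentioned above are ignored. The names `fourier_logits` and `margin`, the modulus p = 23, and the inputs are assumptions chosen purely for illustration. For odd p, summing the votes over all frequencies k = 1, ..., (p − 1)/2 gives (p − 1)/2 at the correct residue and −1/2 everywhere else, which is a standard trigonometric identity.

```python
import numpy as np

def fourier_logits(x, y, p, freqs):
    """Sum single-frequency votes cos(2*pi*k*(x + y - z)/p) over all outputs z."""
    z = np.arange(p)
    d = (x + y - z) % p  # offset of each candidate z from the true sum
    # Rows index frequencies k, columns index candidate outputs z.
    votes = np.cos(2 * np.pi * np.outer(freqs, d) / p)
    return votes.sum(axis=0)

def margin(logits, correct):
    """Gap between the correct logit and the largest incorrect logit."""
    others = np.delete(logits, correct)
    return logits[correct] - others.max()

p = 23                                   # hypothetical odd modulus
x, y = 17, 11                            # hypothetical inputs
correct = (x + y) % p

# A single frequency peaks at the right answer, but only barely.
single = fourier_logits(x, y, p, np.array([1]))

# All frequencies together act like an indicator of the correct sum.
all_freqs = np.arange(1, (p - 1) // 2 + 1)
full = fourier_logits(x, y, p, all_freqs)

prediction = int(np.argmax(full))
print("prediction:", prediction, "correct:", correct)
print("single-frequency margin:", margin(single, correct))
print("all-frequency margin:", margin(full, correct))
```

In this toy setting a single frequency already places its maximum at the correct residue, but neighboring residues score almost as high, so the decision is fragile. Covering many frequencies sharpens the peak into something close to an indicator function, which is the intuition behind the collective, voting-style readout described above.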
