We present GenEFT: an effective theory framework for shedding light on the statics and dynamics of neural network generalization, and illustrate it with graph learning examples. We first investigate the generalization phase transition as data size increases, comparing experimental results with information-theory-based approximations. We find generalization in a Goldilocks zone where the decoder is neither too weak nor too powerful. We then introduce an effective theory for the dynamics of representation learning, where latent-space representations are modeled as interacting particles (repons), and find that it explains our experimentally observed phase transition between generalization and overfitting as encoder and decoder learning rates are scanned. This highlights the power of physics-inspired effective theories for bridging the gap between theoretical predictions and practice in machine learning.