We use geometric invariant theory (GIT) to study the deep linear network (DLN). The Kempf-Ness theorem is used to establish that the $L^2$ regularizer is minimized on the balanced manifold. We introduce related balancing flows using the Riemannian geometry of fibers. The balancing flow defined by the $L^2$ regularizer is shown to converge to the balanced manifold at a uniform exponential rate. The balancing flow defined by the squared moment map is computed explicitly and shown to converge globally. This framework allows us to decompose the training dynamics into two distinct gradient flows: a regularizing flow on fibers and a learning flow on the balanced manifold. It also provides a common mathematical framework for balancedness in deep learning and linear systems theory. We use this framework to interpret balancedness in terms of fast-slow systems, model reduction and Bayesian principles.
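As a small numerical illustration of the claim that the $L^2$ regularizer is minimized on the balanced manifold, the following sketch treats only the depth-2 case $W = W_2 W_1$. It is an assumption-laden toy, not the construction used in the paper: the balancedness condition is taken to be $W_1 W_1^T = W_2^T W_2$, the fiber over $W$ is taken to be the orbit $(g W_1, W_2 g^{-1})$ of an invertible matrix $g$, and the helper names (`balanced_factors`, `l2_regularizer`, `imbalance`) are hypothetical. The comparison with twice the nuclear norm relies on the standard depth-2 identity for the minimum of $\|W_1\|_F^2 + \|W_2\|_F^2$ over factorizations.

```python
import numpy as np


def balanced_factors(W):
    """Return (W1, W2) with W = W2 @ W1 and W1 @ W1.T == W2.T @ W2, built from the SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s)
    W2 = U * root              # scale the columns of U by sqrt(singular values)
    W1 = root[:, None] * Vt    # scale the rows of V^T by sqrt(singular values)
    return W1, W2


def l2_regularizer(W1, W2):
    """Sum of squared Frobenius norms of the two factors."""
    return np.sum(W1 ** 2) + np.sum(W2 ** 2)


def imbalance(W1, W2):
    """Frobenius norm of W1 W1^T - W2^T W2; zero exactly when the pair is balanced."""
    return np.linalg.norm(W1 @ W1.T - W2.T @ W2)


rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))          # end-to-end matrix (illustrative size)

W1, W2 = balanced_factors(W)
reg_balanced = l2_regularizer(W1, W2)

# Move along the fiber: g = Q @ diag(d) is invertible by construction
# (Q orthogonal, d strictly positive), and the product W2 @ W1 is unchanged.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
d = np.array([2.0, 0.5, 1.5, 1.0])
g = Q @ np.diag(d)
g_inv = np.diag(1.0 / d) @ Q.T
V1, V2 = g @ W1, W2 @ g_inv
reg_moved = l2_regularizer(V1, V2)

print("balanced:  regularizer =", reg_balanced, " imbalance =", imbalance(W1, W2))
print("off-fiber point: regularizer =", reg_moved, " imbalance =", imbalance(V1, V2))
print("2 * nuclear norm of W =", 2 * np.linalg.norm(W, "nuc"))
```

Both factor pairs represent the same end-to-end matrix $W$, so they lie on the same fiber; only the balanced pair attains the value $2\|W\|_*$, while the rescaled pair has a strictly larger regularizer and nonzero imbalance. The sketch does not implement either balancing flow and says nothing about convergence rates.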