We propose a custom learning algorithm for shallow over-parameterized neural networks, i.e., networks with a single hidden layer of infinite width. The infinite width of the hidden layer serves as an abstraction for over-parameterization. Building on recent mean field interpretations of the learning dynamics in shallow neural networks, we realize mean field learning as a computational algorithm rather than as an analytical tool. Specifically, we design a Sinkhorn regularized proximal algorithm that approximates the distributional flow of the learning dynamics over weighted point clouds. In this setting, a contractive fixed point recursion computes the time-varying weights, numerically realizing the interacting Wasserstein gradient flow of the parameter distribution supported over the neuronal ensemble. An appealing aspect of the proposed algorithm is that the measure-valued recursions allow meshless computation. We demonstrate the proposed computational framework of interacting weighted particle evolution on binary and multi-class classification. Our algorithm performs gradient descent on the free energy associated with the risk functional.
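To give a concrete sense of the fixed point recursion over weighted point clouds mentioned above, the following is a minimal sketch of the standard Sinkhorn iteration for entropy-regularized optimal transport between two weighted point clouds. It is not the proposed proximal algorithm, which additionally involves the risk functional's free energy. The function name `sinkhorn_plan`, the squared Euclidean cost, and the parameters `eps`, `n_iter`, and `tol` are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(x, y, a, b, eps=0.5, n_iter=1000, tol=1e-12):
    """Hypothetical helper: entropic OT plan between clouds (x, a) and (y, b).

    x: (n, d) points, a: (n,) positive weights summing to 1
    y: (m, d) points, b: (m,) positive weights summing to 1
    """
    # Pairwise squared Euclidean cost (an assumed choice of ground cost).
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    K = np.exp(-C / eps)  # Gibbs kernel

    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        # Alternating scaling updates: the Sinkhorn fixed point recursion.
        u_new = a / (K @ v)
        v = b / (K.T @ u_new)
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break

    # Transport plan diag(u) K diag(v); its marginals approximate a and b.
    return u[:, None] * K * v[None, :]


rng = np.random.default_rng(0)
x = rng.random((5, 2))
y = rng.random((6, 2))
a = np.full(5, 1 / 5)
b = np.full(6, 1 / 6)
plan = sinkhorn_plan(x, y, a, b)
```

No mesh over the parameter space is needed here: the recursion acts only on the weight vectors attached to the sample points, which is the sense in which such schemes are meshless.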