We consider the problem of function approximation by two-layer neural nets with random weights that are "nearly Gaussian" in the sense of Kullback-Leibler divergence. Our setting is the mean-field limit, where the finite population of neurons in the hidden layer is replaced by a continuous ensemble. We show that the problem can be phrased as global minimization of a free energy functional on the space of (finite-length) paths over probability measures on the weights. This functional trades off the $L^2$ approximation risk of the terminal measure against the KL divergence of the path with respect to an isotropic Brownian motion prior. We characterize the unique global minimizer and examine the dynamics in the space of probability measures over weights that can achieve it. In particular, we show that the optimal path-space measure corresponds to the F\"ollmer drift, the solution to a McKean-Vlasov optimal control problem closely related to the classic Schr\"odinger bridge problem. While the F\"ollmer drift cannot in general be obtained in closed form, thus limiting its potential algorithmic utility, we illustrate the viability of the mean-field Langevin diffusion as a finite-time approximation under various conditions on entropic regularization. Specifically, we show that it closely tracks the F\"ollmer drift when the regularization is such that the minimizing density is log-concave.
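To make the mean-field Langevin diffusion mentioned above concrete, here is a minimal particle sketch. It is not the construction analyzed in this work. It assumes, purely for illustration, a one-dimensional regression task, neurons of the form $a\tanh(wx+b)$, and the entropically regularized objective $R(\mu) + \tau\,\mathrm{KL}(\mu \,\|\, \mathcal{N}(0, I))$, whose Langevin dynamics read $d\theta_t = -\big(\nabla_\theta \tfrac{\delta R}{\delta \mu}(\mu_t)(\theta_t) + \tau\theta_t\big)\,dt + \sqrt{2\tau}\,dW_t$. The continuous ensemble is replaced by $N$ particles and time is discretized by Euler-Maruyama. The names `unit`, `predict`, `risk`, `drift`, and `mean_field_langevin`, and all hyperparameter values, are hypothetical.

```python
import numpy as np

def unit(theta, x):
    """Per-neuron outputs a * tanh(w * x + b); theta has columns (a, w, b)."""
    a, w, b = theta[:, 0:1], theta[:, 1:2], theta[:, 2:3]
    return a * np.tanh(w * x[None, :] + b)           # shape (N, n)

def predict(theta, x):
    """Mean-field prediction: average of the neuron outputs over the particles."""
    return unit(theta, x).mean(axis=0)               # shape (n,)

def risk(theta, x, y):
    """Empirical L^2 risk of the particle approximation."""
    return 0.5 * np.mean((predict(theta, x) - y) ** 2)

def drift(theta, x, y, tau):
    """Negative gradient of the first variation of the risk, plus the pull
    toward the standard Gaussian prior (assumed regularizer)."""
    a, w, b = theta[:, 0:1], theta[:, 1:2], theta[:, 2:3]
    t = np.tanh(w * x[None, :] + b)                  # (N, n)
    sech2 = 1.0 - t ** 2                             # derivative of tanh
    resid = (a * t).mean(axis=0) - y                 # (n,)
    g_a = (resid * t).mean(axis=1)
    g_w = (resid * a * sech2 * x[None, :]).mean(axis=1)
    g_b = (resid * a * sech2).mean(axis=1)
    grad = np.stack([g_a, g_w, g_b], axis=1)         # (N, 3)
    return -(grad + tau * theta)

def mean_field_langevin(x, y, n_particles=256, tau=1e-3,
                        step=0.02, n_steps=1000, seed=0):
    """Euler-Maruyama discretization of the interacting particle system."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_particles, 3))    # start from the Gaussian prior
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta + step * drift(theta, x, y, tau) \
                + np.sqrt(2.0 * tau * step) * noise
    return theta

# Illustrative usage: fit sin(x) on a grid.
x = np.linspace(-2.0, 2.0, 64)
y = np.sin(x)
theta = mean_field_langevin(x, y)
print("final risk:", risk(theta, x, y))
```

The temperature `tau` plays the role of the entropic regularization strength in this sketch: larger values keep the particle distribution closer to the Gaussian prior at the cost of a higher approximation risk.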