We study the effect of high-order statistics of data on the learning dynamics of neural networks (NNs) by using a moment-controllable non-Gaussian data model. Motivated by the expressivity of two-layer neural networks, we first construct the data model as a generative two-layer NN whose activation function is expanded in Hermite polynomials. This allows us to achieve interpretable control over high-order cumulants such as skewness and kurtosis through the Hermite coefficients while keeping the data model realistic. Using samples generated from the data model, we perform controlled online learning experiments with a two-layer NN. Our results reveal a moment-wise progression in training: networks first capture low-order statistics such as the mean and covariance, and then progressively learn higher-order cumulants. Finally, we pretrain the generative model on the Fashion-MNIST dataset and leverage the generated samples for further experiments. The results of these additional experiments confirm our conclusions and demonstrate the utility of the data model in a real-world scenario. Overall, our proposed approach bridges simplified data assumptions and practical data complexity, offering a principled framework for investigating distributional effects in machine learning and signal processing.
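The core mechanism can be sketched in a few lines: a latent Gaussian vector is passed through a random first layer, and the activation is a truncated Hermite series whose coefficients tune the output cumulants (e.g., a nonzero coefficient on the second probabilists' Hermite polynomial injects skewness). This is a minimal illustrative sketch under assumed conventions (unit-variance pre-activations, probabilists' Hermite basis), not the paper's exact generative model; the function and parameter names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval  # probabilists' Hermite He_k

rng = np.random.default_rng(0)

def generate(n_samples, dim, latent_dim, coeffs):
    """Draw samples from a toy two-layer generative model with a
    Hermite-expanded activation: x_i = sum_k coeffs[k] * He_k(w_i . z)."""
    # Latent Gaussian input to the generative network.
    z = rng.standard_normal((n_samples, latent_dim))
    # Random first-layer weights, scaled so pre-activations are ~ N(0, 1).
    W = rng.standard_normal((dim, latent_dim)) / np.sqrt(latent_dim)
    pre = z @ W.T  # shape (n_samples, dim)
    # Elementwise Hermite-series activation controls the output cumulants.
    return hermeval(pre, coeffs)

# coeffs = [c0, c1, c2, c3]: c0 shifts the mean, c1 sets the Gaussian part,
# and higher coefficients introduce non-Gaussian structure (e.g. skewness).
x = generate(100_000, 1, 8, [0.0, 1.0, 0.3, 0.0]).ravel()
skew = ((x - x.mean()) ** 3).mean() / x.std() ** 3
print(f"sample skewness with c2 = 0.3: {skew:.2f}")  # clearly positive
```

Setting all coefficients above `c1` to zero recovers (approximately) Gaussian outputs, so sweeping the higher-order coefficients isolates the effect of each cumulant on learning dynamics.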