We study the effect of high-order statistics of data on the learning dynamics of neural networks (NNs) by using a moment-controllable non-Gaussian data model. Motivated by the expressivity of two-layer neural networks, we first construct the data model as a generative two-layer NN whose activation function is expanded in Hermite polynomials. This allows us to achieve interpretable control over high-order cumulants such as skewness and kurtosis through the Hermite coefficients while keeping the data model realistic. Using samples generated from the data model, we perform controlled online learning experiments with a two-layer NN. Our results reveal a moment-wise progression in training: networks first capture low-order statistics such as the mean and covariance, and then progressively learn higher-order cumulants. Finally, we pretrain the generative model on the Fashion-MNIST dataset and leverage the generated samples for further experiments. The results of these additional experiments confirm our conclusions and demonstrate the utility of the data model in a real-world scenario. Overall, our proposed approach bridges simplified data assumptions and practical data complexity, offering a principled framework for investigating distributional effects in machine learning and signal processing.
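The core mechanism can be sketched in a few lines: a latent Gaussian vector is passed through a random first layer, and the activation is a truncated Hermite series whose coefficients tune the output cumulants (e.g., a nonzero coefficient on the second probabilists' Hermite polynomial injects skewness). This is a minimal illustrative sketch under assumed conventions (unit-variance pre-activations, probabilists' Hermite basis), not the paper's exact generative model; the function and parameter names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval  # probabilists' Hermite He_k

rng = np.random.default_rng(0)

def generate(n_samples, dim, latent_dim, coeffs):
    """Draw samples from a toy two-layer generative model with a
    Hermite-expanded activation: x_i = sum_k coeffs[k] * He_k(w_i . z)."""
    # Latent Gaussian input to the generative network.
    z = rng.standard_normal((n_samples, latent_dim))
    # Random first-layer weights, scaled so pre-activations are ~ N(0, 1).
    W = rng.standard_normal((dim, latent_dim)) / np.sqrt(latent_dim)
    pre = z @ W.T  # shape (n_samples, dim)
    # Elementwise Hermite-series activation controls the output cumulants.
    return hermeval(pre, coeffs)

# coeffs = [c0, c1, c2, c3]: c0 shifts the mean, c1 sets the Gaussian part,
# and higher coefficients introduce non-Gaussian structure (e.g. skewness).
x = generate(100_000, 1, 8, [0.0, 1.0, 0.3, 0.0]).ravel()
skew = ((x - x.mean()) ** 3).mean() / x.std() ** 3
print(f"sample skewness with c2 = 0.3: {skew:.2f}")  # clearly positive
```

Setting all coefficients above `c1` to zero recovers (approximately) Gaussian outputs, so sweeping the higher-order coefficients isolates the effect of each cumulant on learning dynamics.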