While artificial neural networks have demonstrated exceptional practical success in a variety of domains, investigations into their theoretical characteristics, such as approximation power, statistical properties, and generalization performance, have concurrently made significant strides. In this paper, we develop a novel theory for understanding the effectiveness of neural networks, which offers a perspective distinct from prior research. Specifically, we explore the rationale underlying a common practice in the construction of neural network models: sample splitting. Our findings indicate that the optimal hyperparameters derived from sample splitting yield a neural network model that asymptotically minimizes the prediction risk. We conduct extensive experiments across different application scenarios and network architectures, and the results support our theory.
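To make the practice of sample splitting concrete, the following is a minimal sketch, not the procedure analyzed in the paper. It assumes a synthetic one-dimensional regression task, a one-hidden-layer tanh network trained by plain SGD, and the hidden width as the only hyperparameter; all function names, the candidate widths, and the 70/30 split are hypothetical choices made for illustration. The data are split once, each candidate is fitted on the training part, and the candidate with the smallest empirical squared-error risk on the held-out part is selected.

```python
import math
import random


def make_data(n, rng):
    """Synthetic regression data, y = sin(3x) + noise (an illustrative assumption)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, math.sin(3.0 * x) + rng.gauss(0.0, 0.1)) for x in xs]


def predict(params, x):
    """Output of a one-hidden-layer tanh network for a scalar input x."""
    w, b, v, c = params
    return c + sum(vj * math.tanh(wj * x + bj) for wj, bj, vj in zip(w, b, v))


def train_network(train, width, rng, epochs=100, lr=0.05):
    """Fit the network by plain SGD on the loss 0.5 * (prediction - y)^2."""
    w = [rng.gauss(0.0, 1.0) for _ in range(width)]  # input-to-hidden weights
    b = [0.0] * width                                # hidden biases
    v = [rng.gauss(0.0, 0.1) for _ in range(width)]  # hidden-to-output weights
    c = 0.0                                          # output bias
    data = list(train)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            h = [math.tanh(w[j] * x + b[j]) for j in range(width)]
            err = c + sum(v[j] * h[j] for j in range(width)) - y
            for j in range(width):
                # Gradient with respect to the pre-activation of hidden unit j,
                # computed before v[j] is updated.
                grad_pre = err * v[j] * (1.0 - h[j] ** 2)
                v[j] -= lr * err * h[j]
                w[j] -= lr * grad_pre * x
                b[j] -= lr * grad_pre
            c -= lr * err
    return w, b, v, c


def risk(params, sample):
    """Empirical squared-error prediction risk on a sample."""
    return sum((predict(params, x) - y) ** 2 for x, y in sample) / len(sample)


def select_by_sample_splitting(data, widths, train_fraction=0.7, seed=0):
    """Split once, train every candidate on one part, compare on the other."""
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)
    cut = int(train_fraction * len(data))
    train, valid = data[:cut], data[cut:]
    risks = {}
    for width in widths:
        params = train_network(train, width, random.Random(seed + width))
        risks[width] = risk(params, valid)  # risk on held-out data only
    best = min(risks, key=risks.get)
    return best, risks


if __name__ == "__main__":
    sample = make_data(200, random.Random(42))
    best_width, validation_risks = select_by_sample_splitting(sample, [1, 2, 8])
    print(best_width, validation_risks)
```

The held-out risk serves here as an estimate of the prediction risk of each candidate, which is the quantity the selected model is claimed to minimize asymptotically; the sketch itself makes no claim about that asymptotic behavior.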