While there are many works on applications of machine learning, far fewer attempt to provide a theoretical justification for its efficiency. In this work, overfitting control (equivalently, the generalization property) in machine learning is explained using analogies from physics and biology. For stochastic gradient Langevin dynamics, we show that the Eyring formula of kinetic theory makes it possible to control overfitting within the algorithmic stability approach: wide minima of the risk function with low free energy correspond to low overfitting. For the generative adversarial network (GAN) model, we establish an analogy between GANs and the predator-prey model in biology. This analogy allows us to explain the selection of wide likelihood maxima and the reduction of overfitting in GANs.
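The stochastic gradient Langevin dynamics mentioned above can be illustrated with a minimal sketch: plain gradient descent on the risk plus Gaussian noise whose scale is set by a temperature parameter. The one-dimensional toy risk, the step size, and the temperature below are illustrative choices, not taken from the paper; the noise term is what lets the iterate escape narrow minima and favor wide ones.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(theta, grad_risk, step=1e-2, temperature=1e-3):
    """One stochastic gradient Langevin dynamics update:
    theta <- theta - step * grad + sqrt(2 * step * T) * xi,
    where xi is standard Gaussian noise."""
    noise = rng.normal(size=theta.shape)
    return theta - step * grad_risk(theta) + np.sqrt(2 * step * temperature) * noise

def grad_risk(theta):
    # Toy risk(t) = (t**2 - 1)**2, with minima at t = +/- 1.
    return 4 * theta * (theta ** 2 - 1)

theta = np.array([2.0])
for _ in range(1000):
    theta = sgld_step(theta, grad_risk)
```

At low temperature the iterate settles near one of the minima and fluctuates around it with magnitude of order the square root of the temperature.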
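The predator-prey analogy invoked for GANs refers to the classical Lotka-Volterra equations, in which two coupled populations (here, loosely, generator and discriminator) oscillate rather than converge monotonically. A minimal Euler-step sketch of those equations, with illustrative coefficients not taken from the paper, is:

```python
def lotka_volterra(prey, pred, alpha=1.0, beta=0.5, delta=0.5, gamma=1.0, dt=1e-3):
    """One forward-Euler step of the predator-prey (Lotka-Volterra) system:
    d(prey)/dt = alpha*prey - beta*prey*pred
    d(pred)/dt = delta*prey*pred - gamma*pred
    """
    dprey = alpha * prey - beta * prey * pred
    dpred = delta * prey * pred - gamma * pred
    return prey + dt * dprey, pred + dt * dpred

x, y = 2.0, 1.0  # initial prey and predator populations (arbitrary)
for _ in range(10_000):
    x, y = lotka_volterra(x, y)
```

Both populations remain positive and cycle around the equilibrium, mirroring the oscillatory rather than convergent character of adversarial training.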