This paper presents a deep machine learning architecture, the "polyharmonic cascade": a sequence of packages of polyharmonic splines in which each layer is rigorously derived from the theory of random functions and the principle of indifference. This makes it possible to approximate nonlinear functions of arbitrary complexity while preserving global smoothness and a probabilistic interpretation. For the polyharmonic cascade, a training method alternative to gradient descent is proposed: instead of directly optimizing the coefficients, a single global linear system is solved on each batch for the function values at fixed "constellations" of nodes. This yields synchronized updates of all layers, preserves the probabilistic interpretation of the individual layers and theoretical consistency with the original model, and scales well: all computations reduce to 2D matrix operations that execute efficiently on a GPU. Fast training without overfitting is demonstrated on MNIST.
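To make the linear-system view concrete, here is a minimal single-layer sketch (an illustration only, not the paper's cascade or its batch training scheme): a polyharmonic-spline interpolant $f(x) = \sum_i w_i\,\varphi(\lVert x - c_i\rVert)$ whose weights are obtained by solving one linear system over a fixed set of nodes, the analogue of a "constellation". The function names `phi`, `fit`, and `predict` are hypothetical.

```python
import numpy as np

def phi(r, k=1):
    # Polyharmonic radial basis: r^k for odd k, r^k * log(r) for even k.
    if k % 2 == 1:
        return r**k
    return np.where(r > 0, r**k * np.log(np.maximum(r, 1e-12)), 0.0)

def fit(nodes, values, k=1):
    # Kernel matrix of pairwise node distances; solve A w = values for the weights.
    # (For k=1 and distinct nodes the distance matrix is nonsingular.)
    a = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    return np.linalg.solve(phi(a, k), values)

def predict(x, nodes, w, k=1):
    # Evaluate the spline: distances to the fixed nodes, then one matrix product.
    d = np.linalg.norm(x[:, None, :] - nodes[None, :, :], axis=-1)
    return phi(d, k) @ w

rng = np.random.default_rng(0)
nodes = rng.uniform(-1.0, 1.0, size=(30, 2))        # fixed "constellation" of nodes
values = np.sin(nodes[:, 0]) * np.cos(nodes[:, 1])  # target samples at the nodes
w = fit(nodes, values)
# The spline interpolates the training nodes up to solver tolerance.
print(np.allclose(predict(nodes, nodes, w), values))
```

Everything here is dense 2D matrix arithmetic (pairwise distances, one solve, one matmul), which is why the same computations port directly to a GPU in the batched setting the abstract describes.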