This paper introduces an iterative algorithm for training additive models that has favorable memory and computational requirements. The algorithm can be viewed as the functional counterpart of stochastic gradient descent, applied to the coefficients of a truncated basis expansion of the component functions. We show that the resulting estimator satisfies an oracle inequality that allows for model mis-specification. In the well-specified setting, by choosing the learning rate carefully across three distinct stages of training, we demonstrate that its risk is minimax optimal in its dependence on both the dimensionality of the data and the size of the training sample. We further illustrate the computational benefits by comparing the approach with traditional backfitting on two real-world datasets.
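To make the idea of stochastic gradient descent on truncated basis coefficients concrete, the following is a minimal sketch, not the algorithm analyzed in the paper. It assumes squared-error loss, covariates scaled to [0, 1], a cosine basis with a fixed truncation level, and a constant learning rate in place of the three-stage schedule. The names `cosine_features`, `sgd_additive_fit`, and `predict` are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_features(x, n_basis):
    """Evaluate sqrt(2) * cos(pi * k * x_j), k = 1..n_basis, for each coordinate x_j.

    Assumes x lies in [0, 1]^d. Returns an array of shape (d, n_basis).
    """
    k = np.arange(1, n_basis + 1)
    return np.sqrt(2.0) * np.cos(np.pi * np.outer(x, k))


def sgd_additive_fit(X, y, n_basis=8, lr=0.02, n_epochs=5, seed=0):
    """Fit f(x) = intercept + sum_j f_j(x_j) by SGD on the basis coefficients.

    Assumption: constant learning rate `lr`; the paper's staged schedule
    is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros((d, n_basis))  # one coefficient vector per component function
    intercept = 0.0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            phi = cosine_features(X[i], n_basis)
            resid = y[i] - (intercept + np.sum(beta * phi))
            # Gradient step for the squared-error loss on a single observation.
            intercept += lr * resid
            beta += lr * resid * phi
    return intercept, beta


def predict(X, intercept, beta):
    """Evaluate the fitted additive model on the rows of X."""
    n_basis = beta.shape[1]
    return np.array(
        [intercept + np.sum(beta * cosine_features(x, n_basis)) for x in X]
    )
```

Each update touches only the `d * n_basis` coefficients and one observation at a time, so memory does not grow with the sample size. This is the kind of storage and computation profile the abstract contrasts with backfitting, although the sketch makes no claim to match the paper's rates.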