In this paper, we provide a mathematical framework for improving generalization in a class of learning problems related to point estimation for modeling high-dimensional nonlinear functions. In particular, we consider a variational problem for a weakly-controlled gradient system, whose control input enters the system dynamics as a coefficient of a nonlinear term scaled by a small parameter. Here, the optimization problem consists of a cost functional, which gauges the quality of the estimated model parameters at a fixed final time with respect to the model-validation dataset, together with a weakly-controlled gradient system, whose time evolution is guided by the model-training dataset and a perturbed version of it with small random noise. Using perturbation theory, we provide results that allow us to solve a sequence of decomposed optimization problems and to aggregate the corresponding approximate optimal solutions, which are reasonably sufficient for improving generalization in this class of learning problems. Moreover, we provide an estimate of the rate of convergence of these approximate optimal solutions. Finally, we present numerical results for a typical nonlinear regression problem.
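A minimal sketch of the setup described above, under assumed notation (the symbols $\theta$, $u$, $\epsilon$, $f$, $J_0$, $\Phi$, $T$, and the dataset names are hypothetical, not fixed by the abstract): the parameters follow a gradient flow on a training loss, weakly perturbed by a controlled nonlinear term, and the control is chosen to minimize a terminal cost on the validation dataset.

```latex
% Weakly-controlled gradient dynamics (assumed form): steepest descent
% on a training loss J_0, plus a controlled nonlinear term f scaled by
% a small parameter epsilon.
\dot{\theta}(t) = -\nabla_{\theta} J_0\bigl(\theta(t);\,\mathcal{D}_{\mathrm{train}}\bigr)
                  + \epsilon\, u(t)\, f\bigl(\theta(t)\bigr),
\qquad 0 < \epsilon \ll 1, \qquad \theta(0) = \theta_0.

% Cost functional: quality of the estimated parameters at the fixed
% final time T, measured against the validation dataset.
\min_{u(\cdot)} \; \Phi\bigl(\theta(T);\,\mathcal{D}_{\mathrm{valid}}\bigr).
```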