Many machine learning models require setting a parameter that controls their size before training, e.g., the number of neurons in DNNs or the number of inducing points in GPs. Increasing capacity typically improves performance until all the information in the dataset is captured. Beyond this point, computational cost keeps increasing without any gain in performance. This raises the question "How big is big enough?" We investigate this problem for Gaussian processes (single-layer neural networks) in continual learning. Here, data becomes available incrementally, so the final dataset size is not known before training, which prevents the use of heuristics for setting a fixed model size. We develop a method that automatically adjusts model size while maintaining near-optimal performance. Our experimental procedure follows the constraint that all hyperparameters must be set without access to dataset properties, and we show that our method performs well across diverse datasets without adjusting its hyperparameter, demonstrating that it requires less tuning than other approaches.
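The abstract does not state the actual growth criterion, so the following is only a minimal illustrative sketch of the general idea, not the paper's method: a sparse GP's inducing set is grown during streaming only when an incoming batch contains inputs that the current inducing points explain poorly, measured by the Nyström residual variance. The RBF kernel, the tolerance `tol`, and all helper names are assumptions introduced for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    # Squared-exponential kernel matrix between the rows of a and b.
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def residual_variance(x, Z, jitter=1e-6):
    # k(x,x) - k(x,Z) K_ZZ^{-1} k(Z,x): how poorly the inducing set Z explains x.
    if len(Z) == 0:
        return np.ones(len(x))
    Kzz = rbf(Z, Z) + jitter * np.eye(len(Z))
    Kxz = rbf(x, Z)
    v = np.linalg.solve(np.linalg.cholesky(Kzz), Kxz.T)
    return np.clip(1.0 - np.sum(v**2, axis=0), 0.0, None)

def grow_inducing_set(Z, X_batch, tol=1e-2):
    # Greedily add batch points whose residual variance exceeds tol, so the
    # model only grows when the new batch carries information not yet captured.
    Z = list(Z)
    for x in X_batch:
        Zc = np.array(Z).reshape(-1, X_batch.shape[1])
        if residual_variance(x[None, :], Zc)[0] > tol:
            Z.append(x)
    return np.array(Z).reshape(-1, X_batch.shape[1])

# Streaming usage: batches arrive one at a time; the final dataset size is unknown.
rng = np.random.default_rng(0)
Z = np.empty((0, 1))
for t in range(5):
    X_t = rng.uniform(-3 + t, -1 + t, size=(50, 1))  # drifting input distribution
    Z = grow_inducing_set(Z, X_t)
    print(f"batch {t}: {len(Z)} inducing points")
```

Under this (assumed) criterion, the number of inducing points stops increasing once new batches fall in regions the model already covers, which mirrors the abstract's claim that capacity need only grow until the dataset's information is captured.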