We propose a new regression algorithm that learns from a set of input-output pairs. The algorithm is designed for populations in which the relation between the input variables and the output variable is heterogeneous across the predictor space. It starts by generating subsets that are concentrated around random points in the input space, and then trains a local predictor on each subset. These predictors are combined in a novel way to yield an overall predictor. We call this algorithm "LEarning with Subset Stacking" (LESS) because of its resemblance to the method of stacking regressors. We offer bagging and boosting variants of LESS and test them against state-of-the-art methods on several datasets. Our comparison shows that LESS is highly competitive.