Feature bagging is a well-established ensembling method that aims to reduce prediction variance by combining the predictions of many estimators trained on subsets or projections of features. Here, we develop a theory of feature bagging in noisy least-squares ridge ensembles and simplify the resulting learning curves in the special case of equicorrelated data. Using analytical learning curves, we demonstrate that subsampling shifts the double-descent peak of a linear predictor. This leads us to introduce heterogeneous feature ensembling, in which estimators are built on varying numbers of feature dimensions, as a computationally efficient method to mitigate double descent. We then compare the performance of a feature-subsampling ensemble to that of a single linear predictor, describing a trade-off between noise amplification due to subsampling and noise reduction due to ensembling. Our qualitative insights carry over to linear classifiers applied to image classification tasks with realistic datasets constructed using a state-of-the-art deep learning feature map.
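As a rough illustration of the setup described in the abstract, the following sketch trains ridge estimators on random feature subsets of different sizes and combines them. It is not the authors' implementation: the function names (`fit_ridge`, `fit_feature_bagged_ridge`, `predict_ensemble`, `demo`), the use of a plain average to combine predictions, the isotropic Gaussian data (rather than equicorrelated data), and all numerical settings are assumptions chosen for illustration.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge solution w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def fit_feature_bagged_ridge(X, y, subset_sizes, lam, rng):
    """Fit one ridge estimator per random feature subset.

    subset_sizes may mix different values, which gives a heterogeneous
    ensemble (estimators built on varying numbers of features).
    """
    n_features = X.shape[1]
    members = []
    for k in subset_sizes:
        # Each member sees only k randomly chosen feature columns.
        idx = rng.choice(n_features, size=k, replace=False)
        members.append((idx, fit_ridge(X[:, idx], y, lam)))
    return members


def predict_ensemble(members, X):
    """Combine member predictions by simple averaging (an assumption)."""
    preds = [X[:, idx] @ w for idx, w in members]
    return np.mean(preds, axis=0)


def demo(seed=0):
    """Synthetic noisy linear regression; all settings are hypothetical."""
    rng = np.random.default_rng(seed)
    n_train, n_test, n_features = 50, 200, 100
    w_true = rng.normal(size=n_features) / np.sqrt(n_features)
    X_train = rng.normal(size=(n_train, n_features))
    X_test = rng.normal(size=(n_test, n_features))
    # Label noise is added to the training targets only.
    y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
    y_test = X_test @ w_true
    members = fit_feature_bagged_ridge(
        X_train, y_train, subset_sizes=[20, 40, 60, 80], lam=1e-2, rng=rng
    )
    y_hat = predict_ensemble(members, X_test)
    return members, float(np.mean((y_hat - y_test) ** 2))
```

Calling `demo()` returns the fitted ensemble members and the test mean-squared error. Because the subset sizes differ, each member sits at a different ratio of features to training samples, which is the ingredient the abstract associates with heterogeneous ensembling.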