Hybrid ensembles, an essential branch of ensemble learning, have flourished in regression, and prior studies confirm the importance of diversity. However, previous ensembles consider diversity in the sub-model training stage and yield limited improvement over single models. In contrast, this study automatically selects and weights sub-models from a heterogeneous model pool by solving an optimization problem with an interior-point filter line-search algorithm. As a novel element, the objective function incorporates negative correlation learning (NCL) as a penalty term, so that a diverse subset of models can be selected. The best sub-models from each model class are selected to build the NCL ensemble, whose performance is better than that of the simple average and other state-of-the-art weighting methods. The NCL ensemble can be improved further by adding a regularization term to the objective function. In practice, model uncertainty makes it difficult to determine the optimal sub-model for a dataset a priori. Regardless, our method can achieve accuracy comparable to that of the potential optimal sub-models. In conclusion, the value of this study lies in its ease of use and effectiveness, which allow the hybrid ensemble to embrace both diversity and accuracy.
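The weighting step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the penalty is written as the weighted spread of sub-model predictions around the ensemble prediction (one common way to express negative correlation), the weights are constrained to be non-negative and sum to one, and SciPy's SLSQP solver stands in for the interior-point filter line-search algorithm. The names `ncl_objective`, `fit_weights`, and `lam`, as well as the synthetic data, are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def ncl_objective(w, preds, y, lam):
    """Ensemble MSE minus lam times the weighted spread of the sub-models.

    preds: (n_samples, n_models) sub-model predictions (assumed layout)
    w:     (n_models,) ensemble weights
    """
    ens = preds @ w                                  # ensemble prediction
    mse = np.mean((ens - y) ** 2)                    # accuracy term
    # Assumed diversity term: weighted squared deviation from the ensemble.
    spread = np.mean(((preds - ens[:, None]) ** 2) @ w)
    return mse - lam * spread                        # larger spread is rewarded


def fit_weights(preds, y, lam=0.1):
    """Find weights on the simplex that minimize the penalized objective."""
    n_models = preds.shape[1]
    w0 = np.full(n_models, 1.0 / n_models)           # start from the simple average
    result = minimize(
        ncl_objective,
        w0,
        args=(preds, y, lam),
        method="SLSQP",                              # stand-in solver, not the paper's
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


# Synthetic illustration: three sub-models with different noise levels.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
preds = np.column_stack(
    [y + rng.normal(scale=s, size=200) for s in (0.3, 0.5, 1.0)]
)
w0 = np.full(3, 1.0 / 3.0)
w = fit_weights(preds, y, lam=0.1)
print("weights:", np.round(w, 3))
```

In this sketch, a sub-model whose weight is driven to zero is effectively dropped, which is one way selection and weighting can happen in a single optimization. A regularization term, as mentioned in the abstract, could be added to `ncl_objective` in the same manner as the penalty.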