Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method

Ensemble learning combines weak learners to produce a strong learner. However, obtaining a large number of base learners requires substantial time and computational resources. It is therefore worthwhile to study how to achieve, with only a few base learners, the performance typically obtained with many. We argue that this requires enhancing both classification performance and generalization ability during the ensemble process. To increase accuracy, the weak base learners must be integrated more efficiently. We observe that different base learners exhibit varying levels of accuracy in predicting different classes. To capitalize on this, we introduce a confidence tensor $\tilde{\mathbf{\Theta}}$, where $\tilde{\mathbf{\Theta}}_{rst}$ denotes the degree of confidence with which the $t$-th base classifier assigns a sample to class $r$ when it actually belongs to class $s$. To the best of our knowledge, this is the first proposal to evaluate the performance of base classifiers separately for each class. The confidence tensor compensates for the strengths and weaknesses of each base classifier on different classes, enabling the method to achieve superior results with fewer base learners. To enhance generalization, we design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative. Furthermore, we prove that in the gradient matrix of the loss function, the elements of each column sum to zero, which allows us to solve a constrained optimization problem using gradient-based methods. We then compare our algorithm with random forests of ten times the size and with other classical methods across numerous datasets, demonstrating the superiority of our approach.
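To make the indexing of $\tilde{\mathbf{\Theta}}_{rst}$ concrete, the following is a minimal sketch and not the method proposed here. The abstract indicates that the confidence tensor is obtained through a margin-based optimization; this sketch instead assumes a much simpler stand-in, namely smoothed per-classifier confusion frequencies estimated on held-out predictions and combined in the style of a naive Bayes combiner. The function names `estimate_confidence_tensor` and `combine_scores`, the `smoothing` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def estimate_confidence_tensor(preds, labels, n_classes, smoothing=1.0):
    """Empirical stand-in for the confidence tensor (an assumption, not the paper's procedure).

    preds:  (T, N) integer array, preds[t, i] is the class predicted by base classifier t.
    labels: (N,) integer array of true classes.
    Returns theta of shape (R, S, T), where theta[r, s, t] estimates how often
    classifier t predicts class r when the true class is s.
    """
    n_learners = preds.shape[0]
    # Start from the smoothing constant so that no entry is exactly zero.
    theta = np.full((n_classes, n_classes, n_learners), smoothing, dtype=float)
    for t in range(n_learners):
        # Count (predicted, true) pairs for classifier t.
        np.add.at(theta, (preds[t], labels, t), 1.0)
    # Normalize over the predicted class r, so each (s, t) column sums to one.
    theta /= theta.sum(axis=0, keepdims=True)
    return theta


def combine_scores(theta, sample_preds):
    """Naive-Bayes-style combination: sum of log-confidences per candidate true class.

    sample_preds: (T,) integer array, the class each base classifier predicts for one sample.
    Returns an (S,) array of scores, one per candidate true class s.
    """
    t_idx = np.arange(theta.shape[2])
    # per_learner[t, s] = theta[sample_preds[t], s, t]
    per_learner = theta[sample_preds, :, t_idx]
    return np.log(per_learner).sum(axis=0)


# Toy held-out data: 2 classes, 3 base classifiers, 8 samples (all hypothetical).
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
preds = np.array([
    [0, 0, 0, 0, 0, 0, 1, 1],  # classifier 0: reliable on class 0, weak on class 1
    [0, 0, 1, 1, 1, 1, 1, 1],  # classifier 1: reliable on class 1, weak on class 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # classifier 2: uninformative
])

theta = estimate_confidence_tensor(preds, labels, n_classes=2)
scores = combine_scores(theta, np.array([1, 1, 0]))
predicted_class = int(np.argmax(scores))
print(theta[:, :, 0])
print(scores, predicted_class)
```

In this sketch, a vote from a classifier that rarely predicts class $r$ for samples of class $s$ counts strongly against $s$, while the uninformative third classifier contributes equally to every class. That is the sense in which per-class confidences can compensate for the strengths and weaknesses of individual base learners.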
