To address the challenges of imbalanced multi-class datasets typically used for rare event detection in critical cyber-physical systems, we propose an optimal, efficient, and adaptable mixed-integer programming (MIP) ensemble weighting scheme. Our approach leverages the diverse capabilities of the classifier ensemble on a granular per-class basis, optimizing the weights of classifier-class pairs with elastic net regularization for improved robustness and generalization. Additionally, it seamlessly and optimally selects a predefined number of classifiers from a given set. We evaluate and compare our MIP-based method against six well-established weighting schemes, using representative datasets and suitable metrics, under various ensemble sizes. The experimental results show that the MIP-based method outperforms all six compared schemes, improving balanced accuracy by 0.99% to 7.31%, with an overall average of 4.53% across all datasets and ensemble sizes. Furthermore, it attains overall average increases of 4.63%, 4.60%, and 4.61% in macro-averaged precision, recall, and F1-score, respectively, while maintaining computational efficiency.
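The ingredients named in the abstract (per-class weights for each classifier-class pair, an elastic net penalty on those weights, selection of a fixed number of classifiers, and balanced accuracy as the metric) can be illustrated with a minimal Python sketch. The abstract does not give the actual MIP formulation, so everything here is an assumption made for illustration: the function names, the scoring rule (a weighted sum of class probabilities), the penalty parameters `lam` and `alpha`, and the exhaustive subset search that stands in for the binary selection variables a MIP solver would handle.

```python
from itertools import combinations


def ensemble_predict(probs, weights, selected):
    """Predict one label per sample by per-class weighted voting.

    probs[m][i][c] : probability classifier m assigns to class c for sample i
    weights[m][c]  : weight of the (classifier m, class c) pair
    selected       : indices of the classifiers kept in the ensemble
    """
    n_samples = len(probs[0])
    n_classes = len(probs[0][0])
    preds = []
    for i in range(n_samples):
        scores = [
            sum(weights[m][c] * probs[m][i][c] for m in selected)
            for c in range(n_classes)
        ]
        # Ties are resolved in favor of the lowest class index.
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds


def elastic_net_penalty(weights, lam=0.1, alpha=0.5):
    """lam * (alpha * L1 + (1 - alpha) / 2 * squared L2) over all weights."""
    flat = [w for row in weights for w in row]
    l1 = sum(abs(w) for w in flat)
    l2_sq = sum(w * w for w in flat)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def best_subset(probs, weights, y_true, k):
    """Exhaustively pick the k classifiers with the best balanced accuracy.

    Hypothetical stand-in for the binary selection variables of a MIP;
    only viable for a handful of classifiers.
    """
    return max(
        combinations(range(len(probs)), k),
        key=lambda s: balanced_accuracy(y_true, ensemble_predict(probs, weights, s)),
    )
```

In this sketch the weights are fixed inputs and only the subset is searched. In a formulation like the one the abstract describes, the weights and the selection would be decided jointly by the solver, with the elastic net term entering the objective as a penalty.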