Predictive models may generate biased predictions when classifying imbalanced datasets: the model favors the majority class, which leads to poor performance in predicting the minority class. To address this issue, balancing or resampling methods are critical data-centric AI approaches for improving prediction performance in the modeling process. In recent years, however, the effectiveness of these methods has been debated and questioned. In particular, during model selection many candidate models may exhibit nearly identical predictive performance, a phenomenon called the Rashomon effect, and yet produce different predictions for the same observations. Selecting one of these models without considering predictive multiplicity (the case in which approximately equally accurate models yield conflicting predictions for some samples) amounts to blind selection. In this paper, we examine the impact of balancing methods on predictive multiplicity through the Rashomon effect. This matters because blind model selection from a set of approximately equally accurate models is risky in data-centric AI and may lead to severe problems in model selection, validation, and explanation. To tackle this, we conducted experiments on real datasets to observe the impact of balancing methods on predictive multiplicity through the Rashomon effect, using a newly proposed metric, obscurity, in addition to the existing metrics ambiguity and discrepancy. Our findings show that balancing methods inflate predictive multiplicity and yield varying results. To monitor the trade-off between prediction performance and predictive multiplicity, and thereby conduct the modeling process responsibly, we propose an extended version of the performance-gain plot for use when balancing the training data.
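To make the multiplicity metrics above concrete, the following is a minimal sketch, not the paper's experimental setup: it assumes the commonly used definitions in the predictive-multiplicity literature (ambiguity as the share of test samples on which at least one model in the Rashomon set disagrees with the best model; discrepancy as the largest per-model disagreement rate with the best model), and uses a hypothetical synthetic imbalanced dataset, an arbitrary pool of scikit-learn classifiers, and an illustrative Rashomon threshold `eps`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary dataset (hypothetical; ~90%/10% class split)
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# A small pool of candidate models; in practice the pool can be much larger
models = [
    LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
    DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr),
    RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr),
]
preds = np.array([m.predict(X_te) for m in models])  # shape: (n_models, n_test)
accs = (preds == y_te).mean(axis=1)

# Rashomon set: all models within eps of the best test accuracy
eps = 0.01
best = accs.argmax()
in_set = accs >= accs[best] - eps
disagree = preds[in_set] != preds[best]  # per-model, per-sample conflicts

# Ambiguity: share of samples where some Rashomon model conflicts with the best model
ambiguity = disagree.any(axis=0).mean()
# Discrepancy: largest share of conflicts between the best model and any single model
discrepancy = disagree.mean(axis=1).max()

print(f"ambiguity={ambiguity:.3f}, discrepancy={discrepancy:.3f}")
```

Rerunning this sketch after resampling `X_tr, y_tr` with a balancing method and comparing the two metric values is the kind of before/after contrast the paper's experiments formalize.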