A multitude of classifiers can be trained on the same data to achieve similar test-time performance while having learned significantly different classification patterns. We call this phenomenon prediction discrepancies, and it often accompanies the blind selection of one model over others with similar performance. When making this choice, the machine learning practitioner has no understanding of the differences between models, their limits, and where they agree or disagree. Yet the choice has concrete consequences for instances falling in the discrepancy zone, since the final decision is based on the selected classification pattern. Beyond the arbitrariness of the result, a poor choice can have further negative consequences such as loss of opportunity or lack of fairness. This paper addresses this question by analyzing the prediction discrepancies within a pool of best-performing models trained on the same data. We propose a model-agnostic algorithm, DIG, that captures and explains discrepancies locally, enabling the practitioner to make a well-informed model selection by anticipating its potential undesired consequences. All the code to reproduce the experiments is available.
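The core phenomenon can be illustrated with a minimal sketch (this is not the paper's DIG algorithm): two classifiers of different families are trained on the same data, reach comparable held-out accuracy, and yet disagree on a measurable fraction of individual instances. The dataset, models, and split used here are illustrative assumptions.

```python
# Minimal sketch (not DIG): two models, similar accuracy, diverging predictions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification task (illustrative choice of parameters).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two best-effort models from different hypothesis classes.
m1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
m2 = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Aggregate performance may look interchangeable...
acc1, acc2 = m1.score(X_te, y_te), m2.score(X_te, y_te)

# ...while individual predictions differ: this is the discrepancy zone.
disagree = m1.predict(X_te) != m2.predict(X_te)
print(f"acc1={acc1:.3f} acc2={acc2:.3f} discrepancy rate={disagree.mean():.3f}")
```

Instances where `disagree` is true are exactly those for which the practitioner's model choice, rather than the data, determines the outcome.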