Human-Aligned Calibration for AI-Assisted Decision Making

Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. The decision maker is then supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has often been argued that the confidence value should correspond to a well-calibrated estimate of the probability that the predicted label matches the ground-truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulty developing a good sense of when to trust a prediction using these confidence values. In this paper, our goal is first to understand why, and then to investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values: an optimal decision maker would sometimes need to place more (less) trust in predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence in her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker needs to place in predictions is monotone in the confidence values, which makes the policy easier to discover. Further, we show that multicalibration with respect to the decision maker's confidence in her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision making tasks, in which a classifier provides decision support to real human experts, validate our theoretical results and suggest that alignment may lead to better decisions.
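To make the role of the decision maker's own confidence more concrete, the following is a minimal sketch of a binned check that asks whether the classifier's confidence remains calibrated within groups defined by the human's confidence. It is an illustration in the spirit of multicalibration, not the procedure used in the paper, whose formal definitions are not given in this abstract. The function name `max_groupwise_calibration_gap`, the equal-width binning, and the bin counts are all assumptions made for the example.

```python
import numpy as np


def max_groupwise_calibration_gap(ai_conf, human_conf, correct,
                                  n_ai_bins=5, n_human_bins=3):
    """Largest |empirical accuracy - mean AI confidence| over all non-empty
    cells of (human-confidence bin, AI-confidence bin).

    Hypothetical helper for illustration only:
      ai_conf    -- classifier confidence in [0, 1] for each prediction
      human_conf -- decision maker's confidence in [0, 1] in her own prediction
      correct    -- 1 if the classifier's predicted label matched the truth, else 0
    """
    ai_conf = np.asarray(ai_conf, dtype=float)
    human_conf = np.asarray(human_conf, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior edges of equal-width bins on [0, 1] (a binning assumption).
    ai_edges = np.linspace(0.0, 1.0, n_ai_bins + 1)[1:-1]
    human_edges = np.linspace(0.0, 1.0, n_human_bins + 1)[1:-1]
    ai_bin = np.digitize(ai_conf, ai_edges)
    human_bin = np.digitize(human_conf, human_edges)

    worst = 0.0
    for hb in range(n_human_bins):
        for ab in range(n_ai_bins):
            mask = (human_bin == hb) & (ai_bin == ab)
            if not mask.any():
                continue  # skip empty cells
            gap = abs(correct[mask].mean() - ai_conf[mask].mean())
            worst = max(worst, gap)
    return worst


# Toy data: the classifier always reports 0.9, but it is right only half
# the time on instances where the human is unsure.
toy_gap = max_groupwise_calibration_gap(
    ai_conf=[0.9, 0.9, 0.9, 0.9],
    human_conf=[0.1, 0.1, 0.9, 0.9],
    correct=[1, 0, 1, 1],
)
```

In the toy data, the reported confidence of 0.9 overstates accuracy far more on the instances where the human is unsure than on those where she is confident, so the same confidence value means different things depending on the human's own confidence. A large gap of this kind is the sort of mismatch that conditioning on the decision maker's confidence is meant to expose.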
