Human-Aligned Calibration for AI-Assisted Decision Making

Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. The decision maker is then supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has often been argued that the confidence value should correspond to a well-calibrated estimate of the probability that the predicted label matches the ground-truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulty developing a good sense of when to trust a prediction using these confidence values. In this paper, our goal is first to understand why, and then to investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values: an optimal decision maker would sometimes need to place more (less) trust in predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence in her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker needs to place in predictions is monotone in the confidence values, which makes the policy easier to discover. Further, we show that multicalibration with respect to the decision maker's confidence in her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision-making tasks, in which a classifier provides decision support to real human experts, validate our theoretical results and suggest that alignment may lead to better decisions.
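To make the distinction between ordinary calibration and calibration with respect to the decision maker's own confidence concrete, the following is a minimal sketch on toy data. It is not the paper's procedure: the function name `groupwise_calibration_gaps`, the record layout, the binning scheme, and the numbers are all illustrative assumptions. It only shows that confidence values can look calibrated overall while being off within each group defined by the human's confidence, which is the kind of gap a multicalibration condition would constrain.

```python
from collections import defaultdict


def groupwise_calibration_gaps(records, n_bins=5):
    """Hypothetical helper (not from the paper).

    records: iterable of (ai_conf, human_level, correct), where
      ai_conf     -- classifier confidence in [0, 1]
      human_level -- label of the human's own confidence group (assumed discrete)
      correct     -- 1 if the predicted label matched the ground truth, else 0
    Returns {(human_level, bin_index): |mean confidence - empirical accuracy|}.
    """
    cells = defaultdict(list)
    for ai_conf, human_level, correct in records:
        # Equal-width binning of the classifier confidence; 1.0 goes in the last bin.
        bin_index = min(int(ai_conf * n_bins), n_bins - 1)
        cells[(human_level, bin_index)].append((ai_conf, correct))

    gaps = {}
    for key, items in cells.items():
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(y for _, y in items) / len(items)
        gaps[key] = abs(mean_conf - accuracy)
    return gaps


# Toy data (made up): the classifier always reports confidence 0.7.
# When the human is unsure ("low"), it is right 5 times out of 10;
# when the human is sure ("high"), it is right 9 times out of 10.
records = (
    [(0.7, "low", 1)] * 5 + [(0.7, "low", 0)] * 5
    + [(0.7, "high", 1)] * 9 + [(0.7, "high", 0)] * 1
)

# Ignoring the human's confidence: 14 of 20 correct, matching the reported 0.7.
marginal = groupwise_calibration_gaps([(c, "all", y) for c, _, y in records])

# Conditioning on the human's confidence exposes a gap of about 0.2 in each group.
gaps = groupwise_calibration_gaps(records)

for key in sorted(gaps):
    print(key, round(gaps[key], 3))
```

In this toy setting, a decision maker who reads 0.7 at face value would trust the classifier equally in both groups, although its accuracy differs substantially between them. Checking calibration within groups of the human's own confidence is one simple way to surface that mismatch.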
