Human-Aligned Calibration for AI-Assisted Decision Making

Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. The decision maker is then supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has often been argued that the confidence value should correspond to a well-calibrated estimate of the probability that the predicted label matches the ground-truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulty developing a good sense of when to trust a prediction using these confidence values. In this paper, our goal is first to understand why, and then to investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values: an optimal decision maker would sometimes need to place more (less) trust in predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence in her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker needs to place in predictions is monotone in the confidence values, which facilitates its discoverability. Further, we show that multicalibration with respect to the decision maker's confidence in her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision-making tasks, in which a classifier provides decision support to real human experts, validate our theoretical results and suggest that alignment may lead to better decisions.
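As a rough illustration of the difference between ordinary calibration and calibration conditioned on the decision maker's own confidence, the following Python sketch measures the gap between the classifier's average confidence and its empirical accuracy within each (human-confidence bin, classifier-confidence bin) cell. The function names, the equal-width binning scheme, and the toy data are assumptions made for illustration; this is not the paper's procedure, only a minimal check in the spirit of multicalibration.

```python
from collections import defaultdict


def groupwise_calibration_gaps(ai_conf, human_conf, correct, n_bins=2):
    """Return {(human_bin, ai_bin): |accuracy - mean AI confidence|}.

    ai_conf    : classifier confidence that its predicted label is correct
    human_conf : decision maker's confidence in her own prediction
    correct    : 1 if the classifier's predicted label matched the truth, else 0
    All names and the binning scheme are hypothetical, for illustration only.
    """
    def to_bin(p):
        # Equal-width bins on [0, 1]; p == 1.0 falls into the last bin.
        return min(int(p * n_bins), n_bins - 1)

    cells = defaultdict(list)
    for c, h, y in zip(ai_conf, human_conf, correct):
        cells[(to_bin(h), to_bin(c))].append((c, y))

    gaps = {}
    for key, items in cells.items():
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(y for _, y in items) / len(items)
        gaps[key] = abs(accuracy - mean_conf)
    return gaps


def max_gap(gaps):
    """Largest calibration gap over all non-empty cells."""
    return max(gaps.values())


# Toy data (invented): the classifier always reports confidence 0.75.
ai_conf = [0.75] * 8
correct = [1, 1, 1, 1, 1, 1, 0, 0]
# The decision maker is unsure on the first four instances, confident on the rest.
human_conf = [0.25] * 4 + [0.75] * 4

# Ignoring human confidence (a single group), the classifier looks calibrated.
marginal = groupwise_calibration_gaps(ai_conf, [0.5] * 8, correct)
# Conditioning on human confidence reveals a gap in both groups.
by_human = groupwise_calibration_gaps(ai_conf, human_conf, correct)

print(max_gap(marginal))  # 0.0
print(max_gap(by_human))  # 0.25
```

In this toy example the same confidence value of 0.75 corresponds to an accuracy of 1.0 when the decision maker is unsure and 0.5 when she is confident, so a confidence value that is calibrated on average can still be misleading once her own confidence is taken into account.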
