How well people use information displays to make decisions is of primary interest in human-centered AI, model explainability, data visualization, and related areas. However, what constitutes a decision problem, and what a study must establish to show that human decisions could be improved, remain open to speculation. We propose a widely applicable definition of a decision problem, synthesized from statistical decision theory and information economics, as a standard for establishing when human decisions can be improved in HCI. We argue that to attribute losses in human performance to forms of bias, an experiment must provide participants with the information that a rational agent would need to identify the utility-maximizing decision. As a demonstration, we evaluate the extent to which recent studies of AI-assisted decision-making meet these criteria. We find that only 10 (26\%) of 39 studies that claim to identify biased behavior present participants with sufficient information to characterize their behavior as deviating from good decision-making in at least one treatment condition. We motivate the value of studying well-defined decision problems by characterizing the kinds of performance loss they allow us to conceive of. In contrast, the ambiguities of a poorly communicated decision problem preclude normative interpretation. We conclude with recommendations for practice.