Decision support systems based on prediction sets help humans solve multiclass classification tasks by narrowing down the set of potential label values to a subset, namely a prediction set, and asking them to always predict a label value from the prediction set. While this type of system has been proven effective at improving the average accuracy of human predictions, by restricting human agency it may cause harm: a human who has succeeded at predicting the ground-truth label of an instance on their own may have failed had they used such a system. In this paper, our goal is to control, by design, how frequently a decision support system based on prediction sets may cause harm. To this end, we start by characterizing the above notion of harm using the theoretical framework of structural causal models. Then, we show that, under a natural, albeit unverifiable, monotonicity assumption, we can estimate how frequently a system may cause harm using only predictions made by humans on their own. Further, we show that, under a weaker monotonicity assumption, which can be verified experimentally, we can bound how frequently a system may cause harm, again using only predictions made by humans on their own. Building upon these assumptions, we introduce a computational framework based on conformal risk control to design decision support systems whose harm frequency is guaranteed to stay below a user-specified threshold. We validate our framework using real human predictions from two different human subject studies and show that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.
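As a rough illustration of the conformal-risk-control step, the sketch below picks the smallest threshold for which a calibrated upper bound on a bounded loss (here, an indicator of counterfactual harm) falls below a target level. All names, the loss encoding, and the threshold parameterization are illustrative assumptions, not the paper's implementation; the method requires the empirical risk to be non-increasing in the threshold.

```python
import numpy as np

def conformal_risk_control(losses_by_lambda, lambdas, alpha, B=1.0):
    """Pick the smallest threshold whose calibrated risk is <= alpha.

    losses_by_lambda: (n, m) array; entry [i, j] is a bounded loss in
    [0, B] (e.g., 1 if the system would cause harm on calibration
    point i under threshold lambdas[j], else 0). Losses are assumed
    non-increasing in lambda, and lambdas is assumed sorted so that
    the empirical risk is non-increasing along the columns.
    """
    n = losses_by_lambda.shape[0]
    risk = losses_by_lambda.mean(axis=0)
    # Conformal risk control certifies the level alpha when the
    # inflated empirical risk satisfies
    #   (n / (n + 1)) * Rhat(lambda) + B / (n + 1) <= alpha.
    adjusted = (n / (n + 1)) * risk + B / (n + 1)
    valid = np.where(adjusted <= alpha)[0]
    if valid.size == 0:
        return None  # no threshold certifies the target risk level
    return lambdas[valid[0]]
```

For example, with ten calibration points whose harm indicators shrink as the threshold grows, the function returns the first threshold whose inflated risk clears the target level, or `None` when the target is unattainable on that calibration set.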