Automated decision systems (ADS) are broadly deployed to inform and support human decision-making across a wide range of consequential settings. However, context-specific details of these settings complicate the goal of establishing meaningful experimental evaluations for prediction-based interventions. Notably, current experiment designs rely on simplifying assumptions about human decision-making in order to derive causal estimates. In reality, specific experimental design decisions may induce cognitive biases in human decision makers, which could then significantly alter the observed effect sizes of the prediction intervention. In this paper, we formalize and investigate various models of human decision-making in the presence of a predictive model aid. We show that each of these behavioural models produces dependencies across decision subjects and results in the violation of existing assumptions, with consequences for treatment effect estimation. This work aims to further advance the scientific validity of intervention-based evaluation schemes for the assessment of ADS deployments.
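To make the abstract's central claim concrete, the following is a minimal sketch of one hypothetical behavioural model: a decision-maker whose acceptance threshold anchors on the running mean of the model predictions seen so far. The anchoring rule, its weight, and the `run_session` helper are illustrative assumptions, not the paper's formalization; the point is only that under such a model, the decision a subject receives depends on the other subjects presented earlier, violating the no-interference assumption behind standard treatment effect estimators.

```python
import numpy as np

def run_session(preds, base=0.5, anchor_weight=0.5):
    """Stylized decision-maker (hypothetical bias model, not the paper's):
    the acceptance threshold drifts toward the running mean of the model
    predictions seen so far, coupling decisions across subjects."""
    decisions = np.empty(len(preds), dtype=bool)
    running_sum = 0.0
    for i, p in enumerate(preds):
        running_sum += p
        # Threshold anchors on the average prediction seen up to subject i.
        threshold = base + anchor_weight * (running_sum / (i + 1) - base)
        decisions[i] = p > threshold
    return decisions

# Two decision subjects with model predictions 0.90 and 0.52.
preds = np.array([0.90, 0.52])
d_fwd = run_session(preds)        # subject with 0.52 is seen after 0.90
d_rev = run_session(preds[::-1])  # same subjects, opposite presentation order
print(bool(d_fwd[1]), bool(d_rev[0]))  # prints: False True
```

The subject with prediction 0.52 is rejected when seen after the 0.90 case (the anchored threshold has risen to 0.605) but accepted when seen first (threshold 0.51). The same subject's outcome thus depends on other subjects in the session, which is exactly the kind of cross-subject dependency the abstract argues invalidates existing experimental assumptions.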