Automated decision systems (ADS) are broadly deployed to inform and support human decision-making across a wide range of consequential settings. However, various context-specific details complicate the goal of establishing meaningful experimental evaluations for prediction-based interventions. Notably, current experiment designs rely on simplifying assumptions about human decision-making in order to derive causal estimates. In reality, specific experimental design decisions may induce cognitive biases in human decision makers, which could then significantly alter the observed effect sizes of the prediction intervention. In this paper, we formalize and investigate various models of human decision-making in the presence of a predictive model aid. We show that each of these behavioural models produces dependencies across decision subjects and results in the violation of existing assumptions, with consequences for treatment effect estimation. This work aims to further advance the scientific validity of intervention-based evaluation schemes for the assessment of ADS deployments.