Human-AI complementarity is the claim that a human supported by an AI system can outperform either alone in a decision-making process. Since its introduction in the human-AI interaction literature, it has gained traction by generalizing the reliance paradigm and by offering a more practical alternative to the contested construct of 'trust in AI.' Yet complementarity faces key theoretical challenges: it lacks precise theoretical anchoring, it is formalized only as a post hoc indicator of relative predictive accuracy, it remains silent about other desiderata of human-AI interactions, and it abstracts away from the magnitude-cost profile of its performance gain. As a result, complementarity is difficult to obtain in empirical settings. In this work, we leverage epistemology to address these challenges by reframing complementarity within the discourse on justificatory AI. Drawing on computational reliabilism, we argue that historical instances of complementarity function as evidence that a given human-AI interaction is a reliable epistemic process for a given predictive task. Together with other reliability indicators that assess the alignment of the human-AI team with epistemic standards and socio-technical practices, complementarity contributes to the degree of reliability of human-AI teams when generating predictions. This supports the practical reasoning of those affected by these outputs, such as patients, managers, and regulators. In summary, our approach suggests that the role and value of complementarity lie not in providing a relative measure of predictive accuracy, but in helping calibrate decision-making to the reliability of AI-supported processes that increasingly shape everyday life.