We introduce a novel framework for incorporating human expertise into algorithmic predictions. Our approach focuses on the use of human judgment to distinguish inputs which "look the same" to any feasible predictive algorithm. We argue that this framing clarifies the problem of human-AI collaboration in prediction tasks, as experts often have access to information, particularly subjective information, which is not encoded in the algorithm's training data. We use this insight to develop a set of principled algorithms for selectively incorporating human feedback only when it improves the performance of any feasible predictor. We find empirically that although algorithms often outperform their human counterparts on average, human judgment can significantly improve algorithmic predictions on specific instances, which can be identified ex ante. In an X-ray classification task, we find that this subset constitutes nearly 30% of the patient population. Our approach provides a natural way of uncovering this heterogeneity and thus enabling effective human-AI collaboration.
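The idea of using human judgment only among inputs that an algorithm cannot tell apart can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single model whose predictions lie in [0, 1], approximates "looks the same" by binning those predictions, and uses a plain correlation threshold where a proper statistical test on held-out data would be appropriate. The names `fit_selective_combiner`, `predict`, `n_bins`, and `min_corr` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_selective_combiner(algo_preds, human_preds, outcomes,
                           n_bins=4, min_corr=0.3):
    """Per bin of algorithmic predictions, decide whether to use human input.

    Assumption: algo_preds are in [0, 1]; inputs in the same bin are treated
    as indistinguishable to the algorithm (a crude stand-in).
    """
    groups = defaultdict(list)
    for a, h, y in zip(algo_preds, human_preds, outcomes):
        b = min(int(a * n_bins), n_bins - 1)
        groups[b].append((h, y))

    rules = {}
    for b, pairs in groups.items():
        hs = [h for h, _ in pairs]
        ys = [y for _, y in pairs]
        h_bar, y_bar = mean(hs), mean(ys)
        var_h = sum((h - h_bar) ** 2 for h in hs)
        var_y = sum((y - y_bar) ** 2 for y in ys)
        cov = sum((h - h_bar) * (y - y_bar) for h, y in pairs)
        corr = cov / (var_h * var_y) ** 0.5 if var_h > 0 and var_y > 0 else 0.0
        if corr >= min_corr:
            # Human judgment tracks the outcome within this bin:
            # fit a one-variable least-squares line on the human prediction.
            slope = cov / var_h
            rules[b] = (y_bar - slope * h_bar, slope)
        else:
            # No usable human signal here: predict the bin's mean outcome.
            rules[b] = (y_bar, 0.0)
    return rules


def predict(rules, algo_pred, human_pred, n_bins=4):
    """Combine predictions; fall back to the algorithm for unseen bins."""
    b = min(int(algo_pred * n_bins), n_bins - 1)
    if b not in rules:
        return algo_pred
    intercept, slope = rules[b]
    return intercept + slope * human_pred


# Toy data: humans are informative when the model predicts 0.1,
# and uninformative when it predicts 0.9.
rules = fit_selective_combiner(
    algo_preds=[0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9],
    human_preds=[0, 1, 0, 1, 0, 1, 0, 1],
    outcomes=[0, 1, 0, 1, 1, 1, 0, 0],
)
```

On this toy data the human prediction is used in the low-prediction bin and ignored in the high-prediction bin, which mirrors the abstract's point that the benefit of human input is concentrated on an identifiable subset of instances.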