Human-AI Collaborative Uncertainty Quantification

AI predictive systems are increasingly embedded in decision-making pipelines, shaping high-stakes choices once made solely by humans. Yet robust decisions under uncertainty still rely on capabilities that current AI lacks: domain knowledge not captured by data, long-horizon context, and reasoning grounded in the physical world. This gap has motivated growing efforts to design collaborative frameworks that combine the complementary strengths of humans and AI. This work advances that vision by identifying the fundamental principles of Human-AI collaboration within uncertainty quantification, a key component of reliable decision making. We introduce Human-AI Collaborative Uncertainty Quantification, a framework that formalizes how an AI model can refine a human expert's proposed prediction set with two goals: avoiding counterfactual harm (the AI must not degrade correct human judgments) and complementarity (the AI should recover correct outcomes the human missed). At the population level, we show that the optimal collaborative prediction set follows an intuitive two-threshold structure over a single score function, extending a classical result in conformal prediction. Building on this insight, we develop practical offline and online calibration algorithms with provable distribution-free, finite-sample guarantees. The online method adapts to distribution shifts, including human behavior that evolves through interaction with AI, a phenomenon we call Human-to-AI Adaptation. Experiments across image classification, regression, and text-based medical decision making show that collaborative prediction sets consistently outperform either agent alone, achieving higher coverage and smaller set sizes across various conditions.
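To make the two-threshold idea concrete, here is a minimal offline sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the score function or the calibration rule, so the sketch assumes a nonconformity score (lower means more plausible), uses a standard split-conformal quantile, and calibrates one threshold on examples where the human set contains the true label and another on examples where it does not. The names `conformal_quantile`, `calibrate`, `collaborative_set`, `eps`, and `delta` are all hypothetical.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score.

    Returns +inf when there are too few points for the requested level.
    """
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if n == 0 or k > n:
        return np.inf
    return float(np.sort(scores)[k - 1])

def calibrate(scores, labels, human_sets, eps, delta):
    """Pick two thresholds from calibration data (assumed procedure).

    scores:     (n, K) nonconformity scores, lower = more plausible
    labels:     (n,) integer true labels
    human_sets: (n, K) boolean mask of labels proposed by the human
    eps:        tolerated rate of dropping a true label the human included
    delta:      tolerated rate of failing to recover a true label the human missed
    """
    idx = np.arange(len(labels))
    true_scores = scores[idx, labels]      # score of the true label per example
    in_human = human_sets[idx, labels]     # did the human set contain the truth?
    tau_in = conformal_quantile(true_scores[in_human], eps)
    tau_out = conformal_quantile(true_scores[~in_human], delta)
    return tau_in, tau_out

def collaborative_set(scores_x, human_set_x, tau_in, tau_out):
    """Keep human-proposed labels scoring <= tau_in; add others scoring <= tau_out."""
    keep = human_set_x & (scores_x <= tau_in)
    add = ~human_set_x & (scores_x <= tau_out)
    return keep | add
```

In this sketch, `tau_in` governs the counterfactual-harm goal (how often a correct human label is pruned) and `tau_out` governs complementarity (how often a missed label is added back). Both thresholds act on the same score, which is the sense in which the structure is "two thresholds over a single score function".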

