The increasing reliance on human preference feedback to judge AI-generated pseudo-labels has created a pressing need for principled, budget-conscious data acquisition strategies. We address the question of how to optimally allocate a fixed annotation budget between ground-truth labels and pairwise preference comparisons. Our solution, grounded in semiparametric inference, casts budget allocation as a monotone missing-data problem. Building on this formulation, we introduce Preference-Calibrated Active Learning (PCAL), a method that learns the optimal data acquisition strategy and yields a statistically efficient estimator for functionals of the data distribution. Theoretically, we prove the asymptotic optimality of the PCAL estimator and establish a key robustness guarantee: the estimator retains good performance even when nuisance models are poorly estimated. Because the framework directly optimizes the estimator's variance rather than requiring a closed-form solution, it applies to a general class of problems. This work provides a principled and statistically efficient approach to budget-constrained learning in modern AI. Simulations and real-data analysis demonstrate the practical benefits and superior performance of the proposed method.
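To make the variance-driven allocation idea concrete, the following is a minimal, hypothetical sketch (not the paper's PCAL algorithm): given a fixed budget split between ground-truth labels and pairwise preferences, we pick the split that directly minimizes a plug-in estimate of the estimator's asymptotic variance via grid search, rather than relying on a closed-form allocation rule. The toy variance model and the per-source variance constants (`var_label`, `var_pref`) are assumptions for illustration only.

```python
import numpy as np

def plugin_variance(frac_labels, var_label=1.0, var_pref=4.0):
    """Toy plug-in variance model: each data source's contribution shrinks
    with its share of the budget; var_pref > var_label reflects that
    pairwise preference feedback is assumed noisier than direct labels.
    (Illustrative stand-in for an estimated asymptotic variance.)"""
    frac_pref = 1.0 - frac_labels
    eps = 1e-8  # guard against degenerate all-or-nothing allocations
    return var_label / (frac_labels + eps) + var_pref / (frac_pref + eps)

def optimal_allocation(grid_size=999):
    """Grid-search the label/preference budget split that minimizes the
    plug-in variance -- no closed-form allocation formula is needed."""
    fracs = np.linspace(0.001, 0.999, grid_size)
    variances = np.array([plugin_variance(f) for f in fracs])
    return float(fracs[np.argmin(variances)])

best = optimal_allocation()
```

For this toy model, minimizing `1/f + 4/(1-f)` over the label fraction `f` yields `f = 1/3`, so the grid search should return a value near 0.333; the same direct-optimization pattern extends to variance estimates without closed-form minimizers.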