The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

In machine learning, "ground truth" refers to the assumed correct labels used to train and evaluate models. However, the foundational "ground truth" paradigm rests on a positivistic fallacy that treats human disagreement as technical noise rather than a vital sociotechnical signal. This systematic literature review analyzes research published between 2020 and 2025 across seven premier venues (ACL, AIES, CHI, CSCW, EAAMO, FAccT, and NeurIPS) to investigate the mechanisms in data annotation practices that facilitate this "consensus trap". Our identification phase captured 30,897 records, which a tiered keyword filtration schema refined to a high-recall corpus of 3,042 records for manual screening, resulting in a final included corpus of 346 papers for qualitative synthesis. Our reflexive thematic analysis reveals that systemic failures in positional legibility, combined with the recent architectural shift toward human-as-verifier models (specifically, the reliance on model-mediated annotations), introduce deep-seated anchoring bias and effectively remove human voices from the loop. We further demonstrate how geographic hegemony imposes Western norms as universal benchmarks, often enforced by the performative alignment of precarious data workers who prioritize requester compliance over honest subjectivity to avoid economic penalties. Critiquing the "noisy sensor" fallacy, where statistical models misdiagnose cultural pluralism as random error, we argue for reclaiming disagreement as a high-fidelity signal essential for building culturally competent models. To address these systemic tensions, we propose a roadmap for pluralistic annotation infrastructures that shift the objective from discovering a singular "right" answer to mapping the diversity of human experience.
