Scientific publishing systematically filters out negative results. We argue that this long-standing asymmetry has become an urgent problem in the era of large language models, which inherit the positive bias of the literature they are trained on, face an impending shortage of high-quality training data, and are increasingly deployed as both research tools and peer reviewers. We analyze three ways in which LLMs have changed the value of failure data and show that the systematic absence of such data degrades their utility as research tools, training data consumers, and peer reviewers alike. We outline experimental protocols to validate these claims and discuss the structural conditions under which a failure-inclusive publishing culture could emerge.