Dense retrieval is a promising approach for acquiring relevant context or world knowledge in open-domain natural language processing tasks and is now widely used in information retrieval applications. However, recent reports claim that retrievers broadly prefer text generated by large language models (LLMs) over human-written text. This bias is called "source bias", and it has been hypothesized that the lower perplexity of LLM-generated text contributes to the effect. In this study, we revisit this claim by conducting a controlled evaluation that traces the emergence of such preferences across training stages and data sources. Using parallel human- and LLM-generated counterparts of the SciFact and Natural Questions (NQ320K) datasets, we compare unsupervised checkpoints with models fine-tuned on in-domain human text, in-domain LLM-generated text, and MS MARCO. Our results show the following: 1) Unsupervised retrievers do not exhibit a uniform pro-LLM preference; the direction and magnitude of the preference depend on the dataset. 2) Across all settings tested, supervised fine-tuning on MS MARCO consistently shifts rankings toward LLM-generated text. 3) In-domain fine-tuning produces dataset-specific and inconsistent shifts in preference. 4) Fine-tuning on LLM-generated corpora induces a pronounced pro-LLM bias. Finally, a retriever-centric perplexity probe, in which a language modeling head is reattached to the fine-tuned dense retriever encoder, shows near-chance agreement between perplexity and relevance, weakening the explanatory power of perplexity. Our study demonstrates that source bias is a training-induced phenomenon rather than an inherent property of dense retrievers.
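The perplexity probe described above can be sketched as follows. This is a minimal toy illustration, not the authors' implementation: a small Transformer encoder stands in for the dense retriever's encoder, a freshly attached linear layer stands in for the reattached language modeling head, and perplexity is computed as pseudo-perplexity (masking one token at a time, as is standard for bidirectional encoders). All model sizes, the mask token id, and the scoring loop are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, DIM = 100, 32  # toy sizes; a real retriever encoder would be e.g. BERT-base
MASK_ID = 0           # assumed mask-token id for this toy vocabulary


class EncoderWithLMHead(nn.Module):
    """Toy stand-in for a dense-retriever encoder with a reattached LM head."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # The probe reattaches a language modeling head on top of the encoder:
        self.lm_head = nn.Linear(DIM, VOCAB)

    def forward(self, ids):
        # ids: (batch, seq_len) -> logits: (batch, seq_len, vocab)
        return self.lm_head(self.encoder(self.embed(ids)))


def pseudo_perplexity(model, ids):
    """Mask each position in turn and score the held-out token (pseudo-PPL)."""
    model.eval()
    nll = 0.0
    with torch.no_grad():
        for i in range(ids.size(1)):
            masked = ids.clone()
            masked[0, i] = MASK_ID
            logits = model(masked)
            logp = torch.log_softmax(logits[0, i], dim=-1)
            nll -= logp[ids[0, i]].item()
    return math.exp(nll / ids.size(1))


model = EncoderWithLMHead()
doc = torch.randint(1, VOCAB, (1, 12))  # a random toy "document"
ppl = pseudo_perplexity(model, doc)
print(f"pseudo-perplexity: {ppl:.1f}")
```

In the actual probe, this score would be computed for parallel human- and LLM-generated documents and compared against the retriever's relevance ranking; the abstract reports that agreement between the two is near chance.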