Position bias is a prevalent issue in modern language models (LMs): models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across applications. Our mechanistic analysis attributes position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Building on this analysis, we propose a training-free, zero-shot approach to eliminate position bias (for example, the order of retrieved documents in QA affecting performance). Our method replaces causal attention with bidirectional attention between documents and uses the model's attention values, rather than the order given in the input prompt, to decide the relative order of documents, thereby enabling Position-INvariant inferencE (PINE) at the document level. By eliminating position bias, models achieve better performance and reliability on downstream tasks, including LM-as-a-judge, retrieval-augmented QA, molecule generation, and math reasoning. Notably, PINE is especially useful when adapting LMs to evaluate reasoning pairs: it consistently yields gains of 8 to 10 percentage points, making Llama-3-70B-Instruct outperform GPT-4-0125-preview and GPT-4o-2024-08-06 on the RewardBench reasoning set.
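The two ingredients named above (bidirectional attention between documents, and attention-based document ordering) can be sketched in miniature. The following is a toy illustration, not the paper's implementation: the function names, the segment-id convention, and the boolean-mask representation are all assumptions made for exposition, and real LMs operate on per-head attention tensors rather than a single mask or score row.

```python
def pine_attention_mask(segment_ids):
    """Toy sketch of a PINE-style attention mask.

    segment_ids: one label per token. A value of -1 marks ordinary
    tokens (system prompt, question) that keep standard causal
    attention; tokens sharing a non-negative id belong to one
    document. Tokens of any two documents may attend to each other
    in both directions, so no document is privileged merely by
    appearing earlier in the prompt.
    """
    n = len(segment_ids)
    # Standard causal mask: token i may attend to tokens j <= i.
    mask = [[j <= i for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            # Open bidirectional attention between document tokens,
            # regardless of their order in the prompt.
            if segment_ids[i] >= 0 and segment_ids[j] >= 0:
                mask[i][j] = True
    return mask


def rank_documents_by_attention(attn_row, segment_ids):
    """Toy stand-in for attention-based relative ordering: rank
    document ids by the total attention mass a query token assigns
    to each document's tokens, highest first."""
    scores = {}
    for a, s in zip(attn_row, segment_ids):
        if s >= 0:
            scores[s] = scores.get(s, 0.0) + a
    return sorted(scores, key=scores.get, reverse=True)
```

For instance, with `segment_ids = [-1, 0, 0, 1, 1, -1]`, a token of document 0 may attend forward to a token of document 1, which a plain causal mask would forbid, while ordinary tokens remain strictly causal.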