Fine-tuning is an effective method for improving the domain performance of large language models (LLMs). However, LLMs may fit dataset biases and shortcuts for prediction, leading to poor generation performance. Prior work has shown that LLMs are prone to position bias, i.e., exploiting information located at the beginning or end of the input, or specific positional cues within it. Existing debiasing methods for LLMs require external bias knowledge or annotated non-biased samples, which are unavailable for position debiasing and impractical to obtain in reality. In this work, we propose a self-supervised position debiasing (SOD) framework to mitigate position bias in LLMs. SOD leverages unsupervised responses from pre-trained LLMs for debiasing without relying on any external knowledge. To improve the quality of these unsupervised responses, we propose an objective alignment (OAM) module to prune them. Experiments on eight datasets and five tasks show that SOD consistently outperforms existing methods in mitigating three types of position bias. Moreover, SOD achieves this at the cost of only a small performance drop on biased samples, making it both general and effective. To facilitate reproducibility, we share the code of all methods and datasets at https://github.com/LZKSKY/SOD.
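To make the notion of position bias concrete, the following is a minimal sketch (not from the paper) of how one might probe for it: place the gold item at each position in the context and check whether the model still recovers it. The `biased_toy_model` is a hypothetical stand-in for an LLM that, by construction, attends only to the first context item, simulating the "beginning of input" bias described above.

```python
def biased_toy_model(context_items, question):
    # Hypothetical stand-in for an LLM: it always "attends" to the
    # first item in the context, simulating position bias.
    return context_items[0]

def position_bias_probe(model, items, gold):
    """Place the gold item at every position in the context and record
    whether the model still recovers it. A position-insensitive model
    would score equally well at every position."""
    distractors = [x for x in items if x != gold]
    hits = []
    for pos in range(len(distractors) + 1):
        context = distractors[:pos] + [gold] + distractors[pos:]
        hits.append(model(context, question="which item is gold?") == gold)
    return hits

items = ["gold", "d1", "d2", "d3"]
hits = position_bias_probe(biased_toy_model, items, "gold")
print(hits)  # -> [True, False, False, False]: succeeds only when gold is first
```

A debiasing method such as SOD aims to flatten this per-position accuracy profile, so that the model's predictions no longer depend on where the relevant information appears in the input.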