Fine-tuning has been demonstrated to be an effective method for improving the domain performance of large language models (LLMs). However, LLMs may fit dataset biases and prediction shortcuts, leading to poor generation performance. Experimental results show that LLMs are prone to position bias, i.e., they leverage information positioned at the beginning or end of the input, or specific positional cues within it. Existing work on mitigating position bias requires external bias knowledge or annotated non-biased samples, which is impractical in reality. In this work, we propose a zero-shot position debiasing (ZOE) framework to mitigate position bias in LLMs. ZOE leverages unsupervised responses from pre-trained LLMs for debiasing, and thus requires no external knowledge or datasets. To improve the quality of the unsupervised responses, we propose a master-slave alignment (MSA) module to prune them. Experiments on eight datasets and five tasks show that ZOE consistently outperforms existing methods in mitigating four types of position bias. Moreover, ZOE is simple and effective: it achieves this while sacrificing only a small amount of performance on biased samples.