Multi-choice questions (MCQs) are a common and important task format in research on large language models (LLMs). We show that LLMs exhibit an inherent "selection bias" in MCQs: a preference for selecting options located at specific positions (such as "Option C"). This bias is prevalent across various LLMs and makes their performance vulnerable to changes in option position. We identify option numbering, i.e., the ID symbols A/B/C/D associated with the options, as one primary cause of selection bias. To mitigate selection bias, we propose a new method called PriDe. PriDe first decomposes the observed model prediction distribution into an intrinsic prediction over option contents and a prior distribution over option IDs. It then estimates the prior by permuting option contents on a small number of test samples and uses the estimated prior to debias the subsequent test samples. We demonstrate that, as a label-free, inference-time method, PriDe achieves more effective and more computation-efficient debiasing than strong baselines. We further show that the priors estimated by PriDe generalize well across different domains, highlighting its practical potential in broader scenarios.
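The decomposition and prior-estimation steps described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: that the decomposition is multiplicative (observed probability proportional to ID prior times content score), that only cyclic permutations of the option contents are used, and that the prior is read off from the averaged log-probabilities. All function names (`estimate_prior`, `average_priors`, `debias`, `make_toy_model`) are hypothetical, and the toy model stands in for a real LLM.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def cyclic_permutations(n):
    """Return n cyclic shifts; perm[i] is the content index shown at ID position i."""
    return [[(i + shift) % n for i in range(n)] for shift in range(n)]


def estimate_prior(observe, n_options):
    """Estimate the prior over option IDs for one sample.

    `observe(perm)` is an assumed callable returning the model's observed
    probabilities over ID positions when content perm[i] is placed at position i.
    Averaging log-probabilities over cyclic shifts cancels the content term,
    because every content visits every position exactly once.
    """
    perms = cyclic_permutations(n_options)
    log_sums = [0.0] * n_options
    for perm in perms:
        for i, p in enumerate(observe(perm)):
            log_sums[i] += math.log(p)
    return normalize([math.exp(s / len(perms)) for s in log_sums])


def average_priors(priors):
    """Average per-sample priors into one global prior (assumed aggregation rule)."""
    n = len(priors[0])
    return normalize([sum(p[i] for p in priors) / len(priors) for i in range(n)])


def debias(observed, prior):
    """Divide out the ID prior and renormalize (assumes the multiplicative form)."""
    return normalize([p / q for p, q in zip(observed, prior)])


def make_toy_model(prior, content_scores):
    """Toy stand-in for an LLM that follows the assumed decomposition exactly."""
    def observe(perm):
        return normalize(
            [prior[i] * content_scores[perm[i]] for i in range(len(prior))]
        )
    return observe
```

In this toy setting, a model with ID prior `[0.1, 0.2, 0.5, 0.2]` and content scores `[0.3, 0.25, 0.25, 0.2]` would pick the third option in the default order, while dividing out the estimated prior restores the first option as the top choice. A real model will not follow the assumed decomposition exactly, so the recovered prior is only an approximation there.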