Multiple-choice questions (MCQs) are a common and important task format in research on large language models (LLMs). Our work shows that LLMs exhibit an inherent "selection bias" in MCQs: a preference for selecting options located at specific positions (such as "Option C"). This bias is prevalent across various LLMs and makes their performance vulnerable to changes in option position. We identify option numbering, i.e., the ID symbols A/B/C/D associated with the options, as one primary cause of selection bias. To mitigate selection bias, we propose a new method called PriDe. PriDe first decomposes the observed model prediction distribution into an intrinsic prediction over option contents and a prior distribution over option IDs. It then estimates the prior by permuting option contents on a small number of test samples and uses the estimated prior to debias the subsequent test samples. We demonstrate that, as a label-free, inference-time method, PriDe achieves more effective and more computation-efficient debiasing than strong baselines. We further show that the priors estimated by PriDe generalize well across different domains, highlighting its practical potential in broader scenarios.
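The decompose-estimate-debias procedure described above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not spell out: that the observed distribution factors as a product of the ID prior and the content prediction (up to normalization), and that the permutations used are cyclic shifts. All function names here (`estimate_prior`, `debias`, `make_toy_model`) are hypothetical, and the toy model stands in for a real LLM call.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def cyclic_permutations(n):
    """All n cyclic shifts; perm[i] is the content index shown at position i."""
    return [[(i + shift) % n for i in range(n)] for shift in range(n)]


def estimate_prior(observe, n_options):
    """Estimate the prior over option IDs for one sample.

    `observe(perm)` must return the model's probabilities over option IDs
    when position i displays content perm[i]. Under the assumed product
    form, averaging log-probabilities over cyclic shifts cancels the
    content term, because every content visits every position once.
    """
    perms = cyclic_permutations(n_options)
    log_sums = [0.0] * n_options
    for perm in perms:
        for i, p in enumerate(observe(perm)):
            log_sums[i] += math.log(p)
    return normalize([math.exp(s / len(perms)) for s in log_sums])


def average_priors(priors):
    """Average per-sample priors into one global prior."""
    n = len(priors[0])
    return [sum(p[i] for p in priors) / len(priors) for i in range(n)]


def debias(observed, prior):
    """Divide out the ID prior and renormalize (assumed product form)."""
    return normalize([p / q for p, q in zip(observed, prior)])


def make_toy_model(id_prior, content_pred):
    """Toy stand-in for an LLM: observed is proportional to prior * content."""
    def observe(perm):
        return normalize(
            [id_prior[i] * content_pred[perm[i]] for i in range(len(perm))]
        )
    return observe


if __name__ == "__main__":
    # Synthetic example: the model favors the first ID, the true answer is content 2.
    observe = make_toy_model([0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.6, 0.1])
    prior = estimate_prior(observe, 4)
    print("estimated prior:", [round(p, 3) for p in prior])
    print("debiased:", [round(p, 3) for p in debias(observe([0, 1, 2, 3]), prior)])
```

In this sketch the prior would be estimated on a small number of samples, combined with `average_priors`, and then reused in `debias` for later samples with a single forward pass each. That reuse is what the abstract's computational-efficiency claim refers to, although the details in the paper may differ from this simplified version.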