Large language models (LLMs) are known to produce varying responses depending on prompt phrasing, indicating that subtle changes in wording can steer their answers. However, the impact of this framing bias on LLM-based evaluation, where models are expected to make stable and impartial judgments, remains largely underexplored. Drawing inspiration from the framing effect in psychology, we systematically investigate how deliberate prompt framing skews model judgments across four high-stakes evaluation tasks. We design symmetric prompts using predicate-positive and predicate-negative constructions and demonstrate that such framing induces significant discrepancies in model outputs. Across 14 LLM judges, we observe clear susceptibility to framing, with model families showing distinct tendencies toward agreement or rejection. These findings suggest that framing bias is a structural property of current LLM-based evaluation systems, underscoring the need for framing-aware evaluation protocols.
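The symmetric framing described above can be sketched as a pair of judge prompts that are logically equivalent but differ in predicate polarity. This is a minimal illustration; the function and prompt wording are hypothetical, not taken from the paper.

```python
def framed_prompts(response: str) -> dict:
    """Build a symmetric pair of judge prompts for the same response.

    The two framings ask logically equivalent questions: a predicate-positive
    form ("is correct?") and a predicate-negative form ("is incorrect?").
    A framing-robust judge should answer Yes to one iff it answers No to
    the other; the paper measures how far real LLM judges deviate from this.
    """
    positive = (
        f"Response: {response}\n"
        "Is this response correct? Answer Yes or No."
    )
    negative = (
        f"Response: {response}\n"
        "Is this response incorrect? Answer Yes or No."
    )
    return {"positive": positive, "negative": negative}

pair = framed_prompts("The capital of France is Paris.")
print(pair["positive"].splitlines()[-1])
print(pair["negative"].splitlines()[-1])
```

Under this setup, a framing-induced discrepancy is any response pair where the judge's verdicts on the positive and negative framings fail to mirror each other.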