Large Language Models (LLMs) routinely infer users' demographic traits from phrasing alone, which can yield biased responses even when no explicit demographic information is provided. The role of disability cues in shaping these inferences remains largely uncharted. We therefore present the first systematic audit of disability-conditioned demographic bias across eight state-of-the-art instruction-tuned LLMs ranging from 3B to 72B parameters. Using a balanced template corpus that pairs nine disability categories with six real-world business domains, we prompt each model to predict five demographic attributes (gender, socioeconomic status, education, cultural background, and locality) under both neutral and disability-aware conditions. Across a varied set of prompts, models deliver a definitive demographic guess in up to 97\% of cases, exposing a strong tendency to make arbitrary inferences with no clear justification. Disability context substantially shifts the predicted attribute distributions, and domain context can further amplify these deviations. We observe that larger models are simultaneously more sensitive to disability cues and more prone to biased reasoning, indicating that scale alone does not mitigate stereotype amplification. Our findings reveal persistent intersections between ableism and other demographic stereotypes, pinpointing critical blind spots in current alignment strategies. We release our evaluation framework and results to encourage disability-inclusive benchmarking, and we recommend integrating abstention calibration and counterfactual fine-tuning to curb unwarranted demographic inference. Code and data will be released upon acceptance.