The rapid adoption of Small Language Models (SLMs) for resource-constrained applications has outpaced our understanding of their ethical and fairness implications. To address this gap, we introduce the Vacuous Neutrality Framework (VaNeu), a multi-dimensional evaluation paradigm designed to assess SLM fairness prior to deployment. The framework examines model robustness across four stages (bias, utility, ambiguity handling, and positional bias) over diverse social bias categories. To the best of our knowledge, this work presents the first large-scale fairness audit of SLMs in the 0.5-5B parameter range, an overlooked "middle tier" between BERT-class encoders and flagship LLMs. We evaluate nine widely used SLMs spanning four model families under both ambiguous and disambiguated contexts. Our findings show that models demonstrating low bias in early stages often fail subsequent evaluations, revealing hidden vulnerabilities and unreliable reasoning. These results underscore the need for a more comprehensive understanding of fairness and reliability in SLMs, and position the proposed framework as a principled tool for responsible deployment in socially sensitive settings.