Because language models are trained on vast datasets that may contain inherent biases, they risk inadvertently perpetuating systemic discrimination. It is therefore essential to examine and address bias in language models and to integrate fairness into their development so that the resulting models are equitable. In this work, we demonstrate the importance of reasoning in zero-shot stereotype identification using Vicuna-13B-v1.3. While we do observe improved accuracy when scaling from 13B to 33B parameters, we show that the performance gain from reasoning significantly exceeds the gain from scaling up. Our findings suggest that reasoning could be a key factor that enables LLMs to transcend the scaling law on out-of-domain tasks such as stereotype identification. Additionally, through a qualitative analysis of select reasoning traces, we highlight how reasoning enhances not just accuracy but also the interpretability of the decision.