Large Language Models (LLMs) are increasingly deployed across diverse real-world applications and user communities. It is therefore crucial that these models remain both morally grounded and knowledge-aware. In this work, we uncover a critical limitation of current LLMs -- their tendency to prioritize moral reasoning over commonsense understanding. To investigate this phenomenon, we introduce CoMoral, a novel benchmark dataset of commonsense contradictions embedded within moral dilemmas. Through extensive evaluation of ten LLMs spanning a range of model sizes, we find that existing models consistently struggle to identify such contradictions without a prior signal. Furthermore, we observe a pervasive narrative-focus bias: LLMs more readily detect commonsense contradictions when they are attributed to a secondary character rather than to the primary (narrator) character. Our analysis underscores the need for reasoning-aware training to improve the commonsense robustness of large language models.