Prior research has shown that language models can, to a limited extent, represent moral norms across a variety of cultural contexts. This study aims to replicate those findings and further examine their validity, concentrating on issues such as 'homosexuality' and 'divorce'. It evaluates the models against data from two surveys, the World Values Survey (WVS) and the Pew Research Center survey (PEW), that cover moral perspectives from over 40 countries. The results show that both monolingual and multilingual models exhibit biases and typically fail to capture the moral intricacies of diverse cultures. The BLOOM model performs best, showing some positive correlations, but it still does not achieve a comprehensive moral understanding. This research underscores the limitations of current pre-trained language models (PLMs) in processing cross-cultural differences in values and highlights the importance of developing culturally aware AI systems that better align with universal human values.