Perception of offensiveness is inherently subjective, shaped by the lived experiences and socio-cultural values of the perceivers. Recent years have seen substantial efforts to build AI-based tools that can detect offensive language at scale, as a means to moderate social media platforms and to ensure the safety of conversational AI technologies such as ChatGPT and Bard. However, existing approaches treat this task as a technical endeavor, built on top of data annotated for offensiveness by a global crowd workforce, without any attention to the crowd workers' provenance or the values their perceptions reflect. We argue that cultural and psychological factors play a vital role in the cognitive processing of offensiveness and are critical to consider in this context. We reframe the task of determining offensiveness as essentially a matter of moral judgment: deciding the boundaries of ethically wrong vs. right language within an implied set of socio-cultural norms. Through a large-scale cross-cultural study of 4309 participants from 21 countries across 8 cultural regions, we demonstrate substantial cross-cultural differences in perceptions of offensiveness. More importantly, we find that individual moral values play a crucial role in shaping these variations: moral concerns about Care and Purity are significant mediating factors driving cross-cultural differences. These insights are of crucial importance as we build AI models for a pluralistic world, where the values they espouse should aim to respect and account for moral values in diverse geo-cultural contexts.