A Unified Framework to Quantify Cultural Intelligence of AI

Sunipa Dev, Vinodkumar Prabhakaran, Rutledge Chin Feman, Aida Davani, Remi Denton, Charu Kalia, Piyawat Lertvittayakumjorn, Madhurima Maji, Rida Qadri, Negar Rostamzadeh, Renee Shelby, Romina Stella, Hayk Stepanyan, Erin van Liemt, Aishwarya Verma, Oscar Wahltinez, Edem Wornyo, Andrew Zaldivar, Saška Mojsilović

As generative AI technologies are launched across the globe, assessing their competence to operate in different cultural contexts is becoming an urgent priority. Recent years have seen numerous and much-needed efforts on cultural benchmarking, but these efforts have largely focused on specific aspects of culture and evaluation. Although they contribute to our understanding of cultural competence, the field needs a unified and systematic evaluation approach to comprehensively assess diverse cultural dimensions at scale. Drawing on measurement theory, we present a principled framework that aggregates multifaceted indicators of cultural capabilities into a unified assessment of cultural intelligence. We start by developing a working definition of culture that identifies its core domains. We then introduce a broad-purpose, systematic, and extensible framework for assessing the cultural intelligence of AI systems. Building on the theoretical framing of psychometric measurement validity theory, we decouple the background concept (i.e., cultural intelligence) from its operationalization via measurement. We conceptualize cultural intelligence as a suite of core capabilities spanning diverse domains, which we then operationalize through a set of indicators designed for reliable measurement. Finally, we identify the considerations, challenges, and research pathways for meaningfully measuring these indicators, focusing specifically on data collection, probing strategies, and evaluation metrics.
