Large language models (LLMs) are increasingly deployed in multicultural settings, yet systematic evaluation of cultural specificity at the sentence level remains underexplored. We propose the Conceptual Cultural Index (CCI), a sentence-level estimate of cultural specificity. CCI is defined as the difference between a sentence's generality estimate within the target culture and its average generality estimate across other cultures. This formulation lets users operationally control the scope of culture through the comparison setting, and it is interpretable because the score derives directly from the underlying generality estimates. We validate CCI on 400 sentences (200 culture-specific and 200 general); the resulting score distribution exhibits the anticipated pattern, with higher scores for culture-specific sentences and lower scores for general ones. For binary separability, CCI outperforms direct LLM scoring, yielding an improvement of more than 10 points in AUC for models specialized to the target culture. Our code is available at https://github.com/IyatomiLab/CCI.
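As an illustration of the definition above, the following is a minimal sketch and not the released implementation. It assumes that each culture's generality estimate is already available as a number (here on a hypothetical 0 to 1 scale); the function names `cci` and `auc`, the culture labels, and all numeric values are made up for the example. The `auc` helper uses the standard pairwise definition of AUC to show how binary separability between culture-specific and general sentences could be measured.

```python
from statistics import mean
from typing import Mapping, Sequence


def cci(generality: Mapping[str, float], target: str) -> float:
    """Target-culture generality minus the mean generality over other cultures.

    `generality` maps a culture label to a generality estimate for one
    sentence. How those estimates are obtained is outside this sketch.
    """
    if target not in generality:
        raise KeyError(f"no generality estimate for target culture {target!r}")
    others = [g for culture, g in generality.items() if culture != target]
    if not others:
        raise ValueError("at least one comparison culture is required")
    return generality[target] - mean(others)


def auc(specific_scores: Sequence[float], general_scores: Sequence[float]) -> float:
    """Pairwise AUC: probability that a culture-specific sentence scores
    higher than a general one, counting ties as one half."""
    if not specific_scores or not general_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s in specific_scores:
        for g in general_scores:
            if s > g:
                wins += 1.0
            elif s == g:
                wins += 0.5
    return wins / (len(specific_scores) * len(general_scores))


# Hypothetical estimates for one sentence that is common knowledge in the
# target culture but much less so elsewhere.
estimates = {"target": 0.9, "other_a": 0.2, "other_b": 0.4}
score = cci(estimates, "target")  # 0.9 minus the mean of 0.2 and 0.4

# Hypothetical CCI scores for a handful of labeled sentences.
separability = auc([0.6, 0.5], [0.1, 0.5])
```

Changing which cultures appear in the mapping changes the comparison setting, which is how the scope of "culture" would be controlled in this sketch.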