Novelty modeling and detection is a core topic in Natural Language Processing (NLP), central to numerous tasks such as recommender systems and automatic summarization. It involves identifying pieces of text that deviate in some way from previously known information. However, novelty also shapes how each person perceives the relevance and quality of an experience, because it depends on that individual's understanding of the world. Social factors, particularly cultural background, profoundly influence perceptions of novelty and innovation. Cultural novelty arises from differences in salience and novelty that are shaped by the distance between distinct communities. While cultural diversity has garnered increasing attention in artificial intelligence (AI), the lack of robust metrics for quantifying cultural novelty hinders the understanding of cultural differences within computational frameworks. To address this, we propose an interdisciplinary framework that integrates knowledge from sociology and management. Central to our approach is GlobalFusion, a novel dataset comprising 500 dishes and approximately 100,000 cooking recipes capturing cultural adaptation from over 150 countries. Using a set of novelty metrics based on Jensen-Shannon Divergence, we leverage this dataset to analyze the textual divergences that emerge when recipes from one community are modified by another with a different cultural background. The results reveal significant correlations between our cultural novelty metrics and established cultural measures based on linguistic, religious, and geographical distances. Our findings highlight the potential of our framework to advance the understanding and measurement of cultural diversity in AI.
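To make the idea of a divergence-based novelty score concrete, the following is a minimal sketch of the standard Jensen-Shannon divergence computed between the word distributions of two texts. It is an illustration under stated assumptions, not the metrics defined in this work: the whitespace tokenization, the helper names (`word_distribution`, `kl_divergence`, `js_divergence`), and the two example recipe strings are all hypothetical.

```python
import math
from collections import Counter


def word_distribution(text):
    """Relative frequency of lowercased, whitespace-separated tokens.

    Assumption: naive tokenization, chosen only to keep the sketch short.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    q must assign nonzero probability to every word that p does.
    """
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two word distributions.

    With base-2 logarithms the value lies between 0 (identical
    distributions) and 1 (no shared vocabulary).
    """
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example texts, not taken from GlobalFusion.
original = "simmer tomatoes with garlic basil and olive oil"
adapted = "simmer tomatoes with garlic ginger and soy sauce"

score = js_divergence(word_distribution(original), word_distribution(adapted))
print(f"JS divergence between the two recipes: {score:.3f}")
```

Because the mixture distribution covers the joint vocabulary, the score stays finite even when the adapted text introduces words absent from the original, which is one reason this divergence is a common choice for comparing texts with different vocabularies.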