Ensuring that large language models (LLMs) respect diverse cultural values is crucial for social equity. However, existing approaches often treat cultural groups as homogeneous and overlook within-group heterogeneity induced by intersecting demographic attributes, leading to unstable behavior as persona granularity varies. We propose ACE-Align (Attribute Causal Effect Alignment), a causal-effect framework that aligns how specific demographic attributes shift different cultural values, rather than treating each culture as a homogeneous group. We evaluate ACE-Align across 14 countries spanning five continents, with personas specified by subsets of four attributes (gender, education, residence, and marital status) and persona granularity defined as the number of specified attributes. Across all persona granularities, ACE-Align consistently outperforms baselines. Moreover, it improves geographic equity, reducing the average alignment gap between high-resource and low-resource regions from 9.81 to 4.92 points, with Africa showing the largest average gain (+8.48 points). Code is available at https://github.com/Wells-Luo/ACE-Align.