Ensuring that large language models (LLMs) respect diverse cultural values is crucial for social equity. However, existing approaches often treat cultural groups as homogeneous and overlook within-group heterogeneity induced by intersecting demographic attributes, leading to unstable behavior as persona granularity varies. We propose ACE-Align (Attribute Causal Effect Alignment), a causal-effect framework that aligns how specific demographic attributes shift different cultural values, rather than treating each culture as a homogeneous group. We evaluate ACE-Align across 14 countries spanning five continents, with personas specified by subsets of four attributes (gender, education, residence, and marital status) and persona granularity defined as the number of specified attributes. Across all persona granularities, ACE-Align consistently outperforms baselines. Moreover, it improves geographic equity, reducing the average alignment gap between high-resource and low-resource regions from 9.81 to 4.92 points, with Africa showing the largest average gain (+8.48 points). Code is available at https://github.com/Wells-Luo/ACE-Align.