RAIGen: Rare Attribute Identification in Text-to-Image Generative Models

Text-to-image diffusion models achieve impressive generation quality but inherit and amplify training-data biases, skewing coverage of semantic attributes. Prior work addresses this in two ways. Closed-set approaches mitigate biases in predefined fairness categories (e.g., gender, race), assuming socially salient minority attributes are known a priori. Open-set approaches frame the task as bias identification, highlighting majority attributes that dominate outputs. Both overlook a complementary task: uncovering rare or minority features (social, cultural, or stylistic) that are underrepresented in the data distribution yet still encoded in model representations. We introduce RAIGen, the first framework, to our knowledge, for label-free rare-attribute discovery in diffusion models, requiring no predefined minority categories. RAIGen leverages Matryoshka Sparse Autoencoders and a novel minority metric combining neuron activation frequency with semantic distinctiveness to identify interpretable neurons whose top-activating images reveal underrepresented attributes. Experiments show RAIGen discovers attributes beyond fixed fairness categories in Stable Diffusion, scales to larger models such as SDXL, supports systematic auditing across architectures, and enables targeted amplification of rare attributes during generation. The project page is available at https://vssilpa.github.io/RAIGen_webpage/.
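The abstract says the minority metric combines neuron activation frequency with semantic distinctiveness but does not give its formula. The following is only a minimal sketch of one way such a score could be built, not RAIGen's actual metric. Everything in it is an assumption made for illustration: the function name `rarity_scores`, the firing `threshold`, the `top_k` cutoff, the use of cosine distance between a neuron's top-activating image embeddings and the dataset-wide mean embedding as "distinctiveness", and the product form `(1 - frequency) * distinctiveness`.

```python
import math


def _unit(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def rarity_scores(activations, embeddings, threshold=0.0, top_k=5):
    """Hypothetical per-neuron rarity score (NOT the formula from the paper).

    activations: one row per image, one column per sparse-autoencoder neuron.
    embeddings:  one semantic embedding vector per image.
    A neuron scores high when it fires on few images (low frequency) and its
    top-activating images point away from the dataset's mean direction.
    """
    unit = [_unit(e) for e in embeddings]
    global_direction = _unit(_mean(unit))
    n_images = len(activations)
    n_neurons = len(activations[0])
    scores = []
    for j in range(n_neurons):
        column = [activations[i][j] for i in range(n_images)]
        active = [i for i in range(n_images) if column[i] > threshold]
        if not active:
            scores.append(0.0)  # a neuron that never fires reveals nothing
            continue
        frequency = len(active) / n_images
        # Keep only the strongest activations among images where it fires.
        top = sorted(active, key=lambda i: column[i], reverse=True)[:top_k]
        centroid = _unit(_mean([unit[i] for i in top]))
        cosine = sum(a * b for a, b in zip(centroid, global_direction))
        scores.append((1.0 - frequency) * (1.0 - cosine))
    return scores


# Toy data (made up): 7 images share one direction, 1 image is an outlier.
toy_embeddings = [[1.0, 0.0]] * 7 + [[0.0, 1.0]]
# Neuron 0 fires on six majority images, neuron 1 only on the outlier,
# neuron 2 never fires.
toy_activations = [[1.0, 0.0, 0.0]] * 6 + [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
toy_scores = rarity_scores(toy_activations, toy_embeddings)
```

On this toy input the neuron that fires only on the outlier image receives the highest score, which is the behavior a rare-attribute ranking would need. In a real pipeline the embeddings would come from some image or text-image encoder and the activations from the trained sparse autoencoder; both choices are left open here.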

