Modern alignment techniques based on human preferences, such as RLHF and DPO, typically employ divergence regularization relative to a reference model to ensure training stability. However, this often limits the flexibility of models during alignment, especially when there is a clear distributional discrepancy between the preference data and the reference model. In this paper, we focus on the alignment of recent text-to-image diffusion models, such as Stable Diffusion XL (SDXL), and find that this "reference mismatch" is indeed a significant problem in aligning these models due to the unstructured nature of visual modalities: e.g., a preference for a particular stylistic aspect can easily induce such a discrepancy. Motivated by this observation, we propose a novel and memory-friendly preference alignment method for diffusion models that does not depend on any reference model, coined margin-aware preference optimization (MaPO). MaPO jointly maximizes the likelihood margin between the preferred and dispreferred image sets and the likelihood of the preferred sets, simultaneously learning general stylistic features and preferences. For evaluation, we introduce two new pairwise preference datasets, Pick-Style and Pick-Safety, which comprise self-generated image pairs from SDXL and simulate diverse scenarios of reference mismatch. Our experiments validate that MaPO significantly improves alignment on Pick-Style and Pick-Safety, as well as general preference alignment when used with Pick-a-Pic v2, surpassing the base SDXL and other existing methods. Our code, models, and datasets are publicly available at https://mapo-t2i.github.io
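The objective described above combines two terms: a margin term that separates preferred from dispreferred likelihoods, and a term that directly reinforces the preferred likelihood, with no reference-model divergence penalty. As an illustrative sketch only (the function name, coefficients `beta` and `gamma`, and the exact weighting are hypothetical and not taken from the paper), such a reference-free margin objective might look like:

```python
import math

def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def margin_aware_loss(logp_preferred: float, logp_dispreferred: float,
                      beta: float = 1.0, gamma: float = 0.1) -> float:
    """Hypothetical sketch of a reference-free, margin-aware preference loss.

    Two terms, matching the high-level description in the abstract:
      1. a margin term pushing the preferred log-likelihood above the
         dispreferred one (via a log-sigmoid of their scaled difference);
      2. a term directly increasing the preferred log-likelihood.
    Note: there is no KL/divergence term against a reference model.
    """
    margin_term = -log_sigmoid(beta * (logp_preferred - logp_dispreferred))
    preferred_term = -gamma * logp_preferred  # minimizing this raises logp_preferred
    return margin_term + preferred_term
```

Minimizing this loss grows both the margin and the preferred likelihood: widening the gap between the two log-likelihoods, or raising the preferred log-likelihood alone, each reduces the loss.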