Diffusion Model Alignment Using Direct Preference Optimization

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has not been widely explored in text-to-image diffusion models; the best existing approach is to fine-tune a pretrained model using carefully curated high quality images and captions to improve visual appeal and text alignment. We propose Diffusion-DPO, a method to align diffusion models to human preferences by directly optimizing on human comparison data. Diffusion-DPO is adapted from the recently developed Direct Preference Optimization (DPO), a simpler alternative to RLHF which directly optimizes a policy that best satisfies human preferences under a classification objective. We re-formulate DPO to account for a diffusion model notion of likelihood, utilizing the evidence lower bound to derive a differentiable objective. Using the Pick-a-Pic dataset of 851K crowdsourced pairwise preferences, we fine-tune the base model of the state-of-the-art Stable Diffusion XL (SDXL)-1.0 model with Diffusion-DPO. Our fine-tuned base model significantly outperforms both base SDXL-1.0 and the larger SDXL-1.0 model consisting of an additional refinement model in human evaluation, improving visual appeal and prompt alignment. We also develop a variant that uses AI feedback and has comparable performance to training on human preferences, opening the door for scaling of diffusion model alignment methods.
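As a rough illustration of the "classification objective" mentioned above, the following minimal sketch computes a DPO-style logistic loss over preference pairs, using per-example squared denoising errors as a stand-in for the ELBO-based likelihood terms. This is an assumption-laden sketch, not the authors' implementation: the function name `diffusion_dpo_loss`, its argument names, and the single `beta_scale` factor (which here absorbs any regularization strength and timestep weighting) are all hypothetical, and the denoising errors are assumed to be precomputed for the trained model and a frozen reference model at the same noised inputs.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)


def diffusion_dpo_loss(err_w_model, err_w_ref, err_l_model, err_l_ref,
                       beta_scale=1.0):
    """DPO-style pairwise loss on denoising errors (illustrative only).

    Each err_* argument is a 1-D array of squared noise-prediction errors,
    one entry per preference pair:
      err_w_*: error on the preferred ("winning") image
      err_l_*: error on the rejected ("losing") image
      *_model: error of the model being fine-tuned
      *_ref:   error of the frozen reference (pretrained) model
    beta_scale is a hypothetical combined scale factor (assumption).
    """
    # How much better the model denoises the preferred image than the rejected one.
    model_diff = err_w_model - err_l_model
    # The same quantity for the reference model, used as a baseline.
    ref_diff = err_w_ref - err_l_ref
    # Lower relative error on the preferred image gives a larger logit.
    logits = -beta_scale * (model_diff - ref_diff)
    # Binary classification loss: negative log-sigmoid of the logit.
    return float(np.mean(-log_sigmoid(logits)))


if __name__ == "__main__":
    ref_err = np.array([0.5, 1.0])
    # A model identical to the reference gives a zero logit for every pair.
    print(diffusion_dpo_loss(ref_err, ref_err, ref_err, ref_err))
```

Under these assumptions, the loss falls when the fine-tuned model reduces its denoising error on the preferred image relative to the reference model more than it does on the rejected image, and it rises in the opposite case.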

