Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

Recent years have witnessed a rapid growth of deep generative models, with text-to-image models gaining significant attention from the public. However, existing models often generate images that do not align well with human preferences, such as awkward combinations of limbs and facial expressions. To address this issue, we collect a dataset of human choices on generated images from the Stable Foundation Discord channel. Our experiments demonstrate that current evaluation metrics for generative models do not correlate well with human choices. Thus, we train a human preference classifier with the collected dataset and derive a Human Preference Score (HPS) based on the classifier. Using HPS, we propose a simple yet effective method to adapt Stable Diffusion to better align with human preferences. Our experiments show that HPS outperforms CLIP in predicting human choices and has good generalization capability toward images generated from other models. By tuning Stable Diffusion with the guidance of HPS, the adapted model is able to generate images that are more preferred by human users. The project page is available here: https://tgxs002.github.io/align_sd_web/ .

翻译：近年来，深度生成模型快速发展，其中文本到图像模型引起了公众的广泛关注。然而，现有模型生成的图像往往与人类偏好不完全一致，例如肢体和面部表情的不协调组合。为解决这一问题，我们从Stable Foundation的Discord频道收集了一个关于生成图像的人类选择数据集。实验表明，当前生成模型的评估指标与人类选择的相关性不佳。因此，我们利用所收集的数据集训练了一个人类偏好分类器，并基于该分类器导出了人类偏好分数（HPS）。基于HPS，我们提出了一种简单有效的方法，使Stable Diffusion模型更好地对齐人类偏好。实验显示，HPS在预测人类选择方面优于CLIP，并对其他模型生成的图像具有良好的泛化能力。通过HPS引导微调Stable Diffusion，调整后的模型能够生成更受人类用户偏好的图像。项目页面详见：https://tgxs002.github.io/align_sd_web/ 。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日