Recent years have witnessed remarkable progress in 3D content generation, yet the corresponding evaluation methods struggle to keep pace. Automatic approaches have proven difficult to align with human preferences, and the mixed comparison of text- and image-driven methods often leads to unfair evaluations. In this paper, we present a comprehensive framework for aligning and evaluating multi-view diffusion models with human preferences. We first collect and filter a standardized image prompt set from DALL$\cdot$E and Objaverse, then use it to generate multi-view assets with several multi-view diffusion models. Through a systematic ranking pipeline on these assets, we obtain a human annotation dataset of 16k expert pairwise comparisons and train a reward model, coined MVReward, to effectively encode human preferences. With MVReward, image-driven 3D methods can be evaluated against each other in a fairer and more transparent manner. Building on this, we further propose Multi-View Preference Learning (MVP), a plug-and-play multi-view diffusion tuning strategy. Extensive experiments demonstrate that MVReward serves as a reliable metric and that MVP consistently enhances the alignment of multi-view diffusion models with human preferences.