3D generation has accelerated rapidly over the past decade, owing to progress in generative modeling. Rendering based on Score Distillation Sampling (SDS) has greatly improved 3D asset generation. Further, recent work on Denoising Diffusion Policy Optimization (DDPO) demonstrates that the diffusion process is compatible with policy gradient methods, and that these methods can improve 2D diffusion models when paired with an aesthetic scoring function. We first show that this aesthetic scorer acts as a strong guide for a variety of SDS-based methods, and we demonstrate its effectiveness in text-to-3D synthesis. We then leverage the DDPO approach to improve the quality of the 3D rendering obtained from 2D diffusion models. Our approach, DDPO3D, employs the policy gradient method in tandem with aesthetic scoring. To the best of our knowledge, this is the first method that extends policy gradient methods to 3D score-based rendering, and it shows improvement across SDS-based methods such as DreamGaussian, which are currently driving research in text-to-3D synthesis. Our approach is compatible with score distillation-based methods, which would facilitate the integration of diverse reward functions into the generative process. The project page is available at https://ddpo3d.github.io.
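To make the policy gradient idea concrete, the following is a minimal toy sketch of a DDPO-style score-function (REINFORCE) update. It is not the DDPO3D implementation: the 1-D Gaussian "denoising" chain, the single learnable parameter `theta`, the function names, and `toy_reward` (a hypothetical stand-in for an aesthetic scorer) are all assumptions chosen for illustration. It only shows how a reward on the final sample can be turned into a gradient through the summed log-probabilities of the denoising steps.

```python
import random


def toy_reward(x0, target=2.0):
    """Hypothetical stand-in for an aesthetic scorer: higher is better."""
    return -(x0 - target) ** 2


def sample_trajectory(theta, steps, sigma, rng):
    """Run a toy 1-D 'denoising' chain from x_T down to x_0.

    Each step samples x_{t-1} ~ N(x_t + theta, sigma^2). Returns the final
    sample and the sum over steps of d/dtheta log p_theta(x_{t-1} | x_t).
    """
    x = rng.gauss(0.0, 1.0)  # x_T drawn from a standard normal prior
    grad_logp = 0.0
    for _ in range(steps):
        mean = x + theta
        x_next = rng.gauss(mean, sigma)
        # Gradient of the Gaussian log-density with respect to theta.
        grad_logp += (x_next - mean) / sigma ** 2
        x = x_next
    return x, grad_logp


def train(iterations=300, batch=64, steps=4, sigma=0.5, lr=0.01, seed=0):
    """Gradient ascent on expected reward using the score-function estimator."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(iterations):
        samples = [sample_trajectory(theta, steps, sigma, rng) for _ in range(batch)]
        rewards = [toy_reward(x0) for x0, _ in samples]
        baseline = sum(rewards) / batch  # batch-mean baseline to reduce variance
        grad = sum((r - baseline) * g for r, (_, g) in zip(rewards, samples)) / batch
        theta += lr * grad
    return theta


if __name__ == "__main__":
    print("learned theta:", train())
```

In this toy setting the expected final sample is `steps * theta`, so the expected reward is maximized at `theta = target / steps` (0.5 with the defaults), and the learned value should settle near that point up to sampling noise. In a text-to-3D pipeline the scalar reward would instead come from a scorer applied to rendered views, which is outside the scope of this sketch.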