We investigate how to improve the physical fidelity of video generation models by leveraging synthetic videos rendered with computer graphics pipelines. Because these rendered videos respect real-world physics, for example by maintaining 3D consistency, they are a valuable resource for improving video generation models. To harness this potential, we propose a solution that curates and integrates synthetic data and introduces a method for transferring its physical realism to the model, which significantly reduces unwanted artifacts. Through experiments on three representative tasks that emphasize physical consistency, we demonstrate the efficacy of this approach in enhancing physical fidelity. While our model still lacks a deep understanding of physics, our work offers one of the first empirical demonstrations that synthetic video enhances physical fidelity in video synthesis. Website: https://kevinz8866.github.io/simulation/