WonderPlay is a novel framework that integrates physics simulation with video generation to produce action-conditioned dynamic 3D scenes from a single image. While prior works are restricted to rigid-body or simple elastic dynamics, WonderPlay features a hybrid generative simulator that synthesizes a wide range of 3D dynamics. The hybrid generative simulator first uses a physics solver to simulate coarse 3D dynamics, which then condition a video generator to produce a video with finer, more realistic motion. The generated video is in turn used to update the simulated dynamic 3D scene, closing the loop between the physics solver and the video generator. This approach combines intuitive user control with the accurate dynamics of physics-based simulators and the expressivity of diffusion-based video generators. Experimental results demonstrate that WonderPlay enables users to interact with scenes of diverse content, including cloth, sand, snow, liquid, smoke, and elastic and rigid bodies -- all from a single image input. Code will be made public. Project website: https://kyleleey.github.io/WonderPlay/
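The closed loop sketched above (coarse physics simulation, generative refinement, scene update) can be illustrated with a toy sketch. Everything below is an illustrative assumption; the function names, the trivial "solver", and the damping "refinement" are stand-ins, not the authors' actual implementation.

```python
# Toy sketch of the hybrid generative simulator loop (illustrative only).

def coarse_physics_step(state, action):
    # Stand-in for a physics solver: a coarse update of the 3D state
    # under the user-specified action.
    return [s + action for s in state]

def video_refine(coarse_states):
    # Stand-in for a diffusion video generator conditioned on the coarse
    # dynamics; here we merely damp the motion to mimic producing
    # finer, more realistic output.
    return [0.9 * s for s in coarse_states]

def hybrid_step(state, action):
    coarse = coarse_physics_step(state, action)
    refined = video_refine(coarse)
    # The generated (refined) result updates the dynamic 3D scene,
    # closing the loop between solver and generator.
    return refined

state = [0.0, 1.0]
for _ in range(3):
    state = hybrid_step(state, action=1.0)
print(state)
```

The key design point is that neither component runs open-loop: the solver constrains the generator toward physically plausible motion, and the generator's output becomes the state the solver sees at the next step.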