We present PhyCAGE, the first approach for physically plausible compositional 3D asset generation from a single image. Given an input image, we first generate consistent multi-view images of the asset's components. We then fit 3D Gaussian Splatting representations to these images. To ensure that the Gaussians representing different objects are physically compatible with each other, we introduce a Physical Simulation-Enhanced Score Distillation Sampling (PSE-SDS) technique that further optimizes the positions of the Gaussians. PSE-SDS uses the gradient of the SDS loss as the initial velocity of a physical simulation, so that the simulator acts as a physics-guided optimizer that progressively corrects the Gaussians' positions toward a physically compatible state. Experimental results demonstrate that the proposed method can generate physically plausible compositional 3D assets from a single image.
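The idea of treating a simulator as a physics-guided optimizer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: `surrogate_sds_grad` is a hypothetical quadratic stand-in for the real SDS gradient, `simulate` is a toy explicit-Euler integrator with a rigid floor standing in for a second object, and the negative sign, the `lr` scale, and the damping factor are choices of this sketch. It only shows how a loss gradient can seed the initial velocity of a simulation whose contact handling keeps positions collision-free.

```python
import numpy as np

def surrogate_sds_grad(positions, targets):
    # Hypothetical stand-in for the SDS gradient:
    # the gradient of 0.5 * ||x - target||^2 with respect to x.
    return positions - targets

def simulate(positions, velocities, dt=0.1, steps=10, damping=0.9, floor=0.0):
    # Toy simulator (an assumption, not the paper's): explicit Euler
    # integration with a rigid floor at y = floor and velocity damping.
    x = positions.copy()
    v = velocities.copy()
    for _ in range(steps):
        x = x + dt * v
        below = x[:, 1] < floor
        x[below, 1] = floor   # project penetrating particles back onto the floor
        v[below, 1] = 0.0     # cancel their vertical velocity (inelastic contact)
        v = v * damping       # dissipate energy so the motion settles
    return x

def pse_sds_step(positions, targets, lr=1.0):
    # Use the (negated, scaled) loss gradient as the initial velocity,
    # then let the simulator resolve contacts while the particles move.
    grad = surrogate_sds_grad(positions, targets)
    initial_velocity = -lr * grad
    return simulate(positions, initial_velocity)

def optimize(positions, targets, iters=50):
    # Repeat the simulation-based update for a fixed number of iterations.
    x = positions.copy()
    for _ in range(iters):
        x = pse_sds_step(x, targets)
    return x
```

In this toy setting, a particle whose target lies below the floor comes to rest on the floor instead of penetrating it, while unconstrained coordinates still move toward their targets as they would under plain gradient descent.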