In this paper, we present a novel shape reconstruction method that leverages a diffusion model to generate a sparse 3D point cloud for the object captured in a single RGB image. Recent methods typically use global embeddings or local projection-based features as the condition to guide the diffusion model. However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance. We propose CCD-3DR, which exploits a novel centered diffusion probabilistic model for consistent local feature conditioning. We constrain the noise and the sampled point cloud from the diffusion model to a subspace in which the point cloud center remains unchanged during both the forward diffusion process and the reverse process. The stable point cloud center then serves as an anchor to align each point with its corresponding local projection-based features. Extensive experiments on the synthetic benchmark ShapeNet-R2N2 demonstrate that CCD-3DR outperforms all competitors by a large margin, with over 40% improvement. We also provide results on the real-world dataset Pix3D to demonstrate the potential of CCD-3DR in real-world applications. Code will be released soon.
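The center-preserving constraint described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a standard DDPM-style closed-form forward step, a point cloud that has already been centered at the origin, and noise that is projected onto the zero-mean subspace by subtracting its per-axis mean. The names `center_noise`, `forward_diffuse_centered`, and `alpha_bar_t` are hypothetical and chosen only for illustration.

```python
import numpy as np


def center_noise(eps):
    """Project noise onto the zero-mean subspace.

    eps: (N, 3) array. After projection, the mean over the N points
    is zero along each axis, so adding it cannot move the centroid.
    """
    return eps - eps.mean(axis=0, keepdims=True)


def forward_diffuse_centered(x0, alpha_bar_t, rng):
    """Sample x_t from x_0 with centered Gaussian noise.

    Assumption: x0 is an (N, 3) point cloud centered at the origin,
    and alpha_bar_t in (0, 1] is the cumulative noise-schedule
    product of a standard DDPM (hypothetical parameter name).
    """
    eps = center_noise(rng.standard_normal(x0.shape))
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps


# Usage: a random cloud, centered at the origin before diffusion.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((1024, 3))
x0 = x0 - x0.mean(axis=0, keepdims=True)

x_t, eps = forward_diffuse_centered(x0, alpha_bar_t=0.5, rng=rng)
centroid = x_t.mean(axis=0)  # stays at the origin up to floating-point error
```

Because both the clean cloud and the noise have zero mean in this sketch, the centroid of the noisy cloud stays at the origin for any noise level. A fixed center of this kind is what would let each point be projected into the image consistently when looking up local features.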