We propose FrePolad: frequency-rectified point latent diffusion, a point cloud generation pipeline integrating a variational autoencoder (VAE) with a denoising diffusion probabilistic model (DDPM) for the latent distribution. FrePolad simultaneously achieves high quality, diversity, and flexibility in point cloud cardinality for generation tasks while maintaining high computational efficiency. The improvement in generation quality and diversity is achieved through (1) a novel frequency rectification via spherical harmonics designed to retain high-frequency content while learning the point cloud distribution; and (2) a latent DDPM to learn the regularized yet complex latent distribution. In addition, FrePolad supports variable point cloud cardinality by formulating the sampling of points as conditional distributions over a latent shape distribution. Finally, the low-dimensional latent space encoded by the VAE contributes to FrePolad's fast and scalable sampling. Our quantitative and qualitative results demonstrate FrePolad's state-of-the-art performance in terms of quality, diversity, and computational efficiency. Project page: https://chenliang-zhou.github.io/FrePolad/.
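To make the variable-cardinality idea concrete, the following is a minimal sketch, not FrePolad's implementation. It assumes a shape latent `z` is already available (here drawn from a standard Gaussian in place of the trained latent DDPM) and uses a hypothetical toy decoder, `toy_point_decoder`, as a stand-in for the conditional distribution over points given `z`. Every name and the ellipsoid geometry are illustrative assumptions. The point it illustrates is only that, once points are sampled independently given the latent, the same shape can be rendered at any cardinality.

```python
import math
import random


def sample_shape_latent(rng, dim=3):
    """Stand-in for a shape latent z.

    Assumption: in the real pipeline z would be produced by the latent DDPM;
    here it is simply a standard Gaussian vector.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def ellipsoid_axes(z):
    """Hypothetical mapping from the latent to ellipsoid semi-axes in (0.5, 1.5)."""
    return [1.0 + 0.5 * math.tanh(zi) for zi in z]


def toy_point_decoder(z, rng):
    """Toy stand-in for p(x | z): one point on an ellipsoid determined by z."""
    # A normalized Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in direction)) or 1.0
    # Scale the unit direction by the latent-dependent semi-axes.
    return tuple(a * c / norm for a, c in zip(ellipsoid_axes(z), direction))


def sample_point_cloud(z, num_points, rng):
    """Draw num_points points independently, all conditioned on the same z."""
    return [toy_point_decoder(z, rng) for _ in range(num_points)]


rng = random.Random(0)
z = sample_shape_latent(rng)

# The same latent shape sampled at two different cardinalities.
coarse = sample_point_cloud(z, 256, rng)
dense = sample_point_cloud(z, 2048, rng)
```

Because each point is drawn independently given `z`, the cardinality is a sampling-time argument rather than a property fixed by the model, which is the flexibility the abstract describes.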