Generative models aim to learn the distribution of observed data so that new instances can be generated from it. With the advent of neural networks, deep generative models, including variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models (DMs), have made remarkable progress in synthesizing 2D images. Recently, researchers have begun shifting their focus from 2D to 3D, since 3D data is more closely aligned with the physical world and holds immense practical potential. However, unlike 2D images, which have an inherent and efficient representation (\textit{i.e.}, a pixel grid), 3D data is significantly harder to represent. Ideally, a robust 3D representation should model complex shapes and appearances accurately while handling high-resolution data efficiently, with high processing speed and low memory requirements. Regrettably, existing 3D representations, such as point clouds, meshes, and neural fields, often fail to satisfy all of these requirements simultaneously. In this survey, we thoroughly review ongoing developments in 3D generative models, including methods that employ 2D and 3D supervision. Our analysis centers on generative models, with a particular focus on the representations they use. We believe our survey will help the community track the field's evolution and spark innovative ideas that propel progress on this challenging task.