We target a 3D generative model for general natural scenes, which are typically unique and intricate. The lack of sufficient training data, together with the difficulty of crafting ad hoc designs in the presence of varying scene characteristics, renders existing setups intractable. Inspired by classical patch-based image models, we advocate synthesizing 3D scenes at the patch level, given a single example. At the core of this work lie important algorithmic designs for the scene representation and the generative patch nearest-neighbor module, which address the unique challenges arising from lifting the classical 2D patch-based framework to 3D generation. Collectively, these design choices contribute to a robust, effective, and efficient model that can generate high-quality general natural scenes with both realistic geometric structure and visual appearance, in large quantities and varieties, as demonstrated on a variety of exemplar scenes.