In the realm of 3D computer vision, parametric models have emerged as a groundbreaking methodology for the creation of realistic and expressive 3D avatars. Traditionally, they rely on Principal Component Analysis (PCA), given its ability to decompose data into an orthonormal space that maximally captures shape variations. However, due to the orthogonality constraints and the global nature of PCA's decomposition, these models struggle to perform localized and disentangled editing of 3D shapes, which severely limits their use in applications requiring fine control, such as face sculpting. In this paper, we leverage diffusion models to enable diverse and fully localized edits on 3D meshes while completely preserving the unedited regions. We propose an effective diffusion masking training strategy that, by design, facilitates localized manipulation of any shape region, without being limited to predefined regions or to sparse sets of predefined control vertices. Following our framework, a user can explicitly set their manipulation region of choice and define an arbitrary set of vertices as handles to edit a 3D mesh. Compared with the current state of the art, our method yields more interpretable shape manipulations than methods relying on latent-code manipulation, greater localization and generation diversity, and faster inference than optimization-based approaches. Project page: https://rolpotamias.github.io/Shapefusion/
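The global nature of PCA editing that motivates this work can be illustrated with a toy sketch (not from the paper; all names and data here are hypothetical): in a PCA-based parametric model, a shape is the mean plus a linear combination of orthonormal components, so perturbing a single coefficient moves essentially every vertex of the mesh rather than a localized region.

```python
import numpy as np

# Toy illustration of a PCA-based parametric shape model.
# Shapes are (V, 3) vertex arrays flattened to 3V-dimensional vectors.
rng = np.random.default_rng(0)
V = 100                                 # number of mesh vertices (toy value)
shapes = rng.normal(size=(50, 3 * V))   # 50 synthetic "training" shapes

mean = shapes.mean(axis=0)
# SVD of the centered data yields orthonormal PCA components (rows of Vt).
_, _, components = np.linalg.svd(shapes - mean, full_matrices=False)

def reconstruct(coeffs, k=10):
    """Shape = mean + linear combination of the first k PCA components."""
    return mean + coeffs @ components[:k]

# Editing a single PCA coefficient perturbs (almost) every vertex coordinate,
# since each component is a dense, global deformation of the whole mesh.
base = reconstruct(np.zeros(10))
edit = reconstruct(np.eye(10)[0] * 3.0)   # push only the first coefficient
changed = np.abs(edit - base) > 1e-8
print(changed.mean())                     # fraction of coordinates affected
```

On random data like this, virtually 100% of coordinates move, which is exactly the lack of localization the abstract describes; a localized edit would leave the masked-out region untouched.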