Recent generative models can create visually plausible 3D representations of objects. However, the generation process often allows only implicit control signals, such as contextual descriptions, and rarely supports bold geometric distortions beyond the existing data distribution. We propose a geometric stylization framework that deforms a 3D mesh so that it expresses the style of a reference image. Because style is inherently ambiguous, we use pre-trained diffusion models to extract an abstract representation of the provided image. Our coarse-to-fine stylization pipeline can drastically deform the input 3D model to express a diverse range of geometric variations while preserving the valid topology and part-level semantics of the original mesh. We also propose an approximate VAE encoder that provides efficient and reliable gradients from mesh renderings. Extensive experiments demonstrate that our method creates stylized 3D meshes that reflect the unique geometric features of the pictured assets, such as expressive poses and silhouettes, thereby supporting distinctive artistic 3D content creation. Project page: https://changwoonchoi.github.io/GeoStyle