We present MeshFlow, a new method for generating artist-like 3D meshes. Current mesh generators often adopt autoregressive (AR) next-token prediction, a natural choice given the discrete nature of mesh topology. However, AR methods scale poorly because their inference cost is quadratic in mesh size. They also require discretizing the vertex coordinates, which introduces quantization errors. To address these challenges, we introduce a Variational Autoencoder (VAE) that, supervised with a contrastive loss, represents both continuous vertex positions and discrete connectivity in a single continuous latent space. This latent space is significantly more compact than prior token-based mesh representations. We then build a 3D generator based on a Rectified Flow transformer that generates all mesh vertices and edges in parallel. Our model generates meshes 18x faster than the fastest AR generator while also achieving excellent accuracy across standard mesh-generation metrics. Homepage: https://mesh-flow.github.io/, Code: https://github.com/facebookresearch/meshflow
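To make the parallel-generation idea concrete, the following is a minimal sketch of rectified-flow sampling, not the MeshFlow implementation. It assumes the usual convention that t = 0 is noise and t = 1 is data, and integrates the flow with fixed-step Euler updates. The names `euler_sample`, `make_toy_velocity`, and `target_latent` are hypothetical, and the toy velocity field stands in for the trained transformer: it is the exact rectified-flow velocity when the data distribution is a single point, which is only an illustrative assumption.

```python
import random


def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)
        # Every latent entry is updated in the same step; there is no
        # token-by-token loop as in autoregressive decoding.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a trained velocity network.

    For the straight path x_t = (1 - t) * x0 + t * x1 with a single fixed
    data point x1 = target, the velocity x1 - x0 equals (x1 - x_t) / (1 - t).
    """
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


rng = random.Random(0)
target_latent = [0.5, -1.0, 2.0, 0.25]          # hypothetical 4-dim latent
noise = [rng.gauss(0.0, 1.0) for _ in target_latent]
sample = euler_sample(make_toy_velocity(target_latent), noise, num_steps=8)
```

In this toy setting the number of network evaluations is `num_steps`, independent of how many latent entries there are, whereas an AR decoder needs one evaluation per generated token. In a real system the velocity would come from a learned model and the resulting latent would be decoded to vertices and edges by the VAE decoder; neither step is shown here.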