3D editing has emerged as a critical research area, providing users with flexible control over 3D assets. While current editing approaches predominantly focus on 3D Gaussian Splatting or multi-view images, direct editing of 3D meshes remains underexplored. Prior attempts, such as VoxHammer, rely on voxel-based representations that suffer from limited resolution and require labor-intensive 3D masks. To address these limitations, we propose \textbf{VecSet-Edit}, the first pipeline that leverages a high-fidelity VecSet Large Reconstruction Model (LRM) as the backbone for mesh editing. Our approach is grounded in an analysis of the spatial properties of VecSet tokens, revealing that distinct token subsets govern distinct geometric regions. Based on this insight, we introduce Mask-guided Token Seeding and Attention-aligned Token Gating strategies to precisely localize target regions using only 2D image conditions. Furthermore, accounting for the differences between the VecSet diffusion process and its voxel-based counterpart, we design Drift-aware Token Pruning to reject geometric outliers during denoising. Finally, our Detail-preserving Texture Baking module preserves not only the geometric details of the original mesh but also its textural information. More details can be found on our project page: https://github.com/BlueDyee/VecSet-Edit/tree/main