Existing pretrained models for 3D mesh generation often suffer from data biases and produce low-quality results, while global reinforcement learning (RL) methods rely on object-level rewards that struggle to capture local structural details. To address these challenges, we present Mesh-RFT, a novel fine-grained reinforcement fine-tuning framework that employs Masked Direct Preference Optimization (M-DPO) to enable localized refinement via quality-aware face masking. To facilitate efficient quality evaluation, we introduce an objective, topology-aware scoring system that evaluates geometric integrity and topological regularity at both the object and face levels through two metrics: Boundary Edge Ratio (BER) and Topology Score (TS). By integrating these metrics into a fine-grained RL strategy, Mesh-RFT becomes the first method to optimize mesh quality at the granularity of individual faces, resolving localized errors while preserving global coherence. Experimental results show that our M-DPO approach reduces Hausdorff Distance (HD) by 24.6% and improves TS by 3.8% over pretrained models, while outperforming global DPO methods with a 17.4% HD reduction and a 4.9% TS gain. These results demonstrate Mesh-RFT's ability to improve geometric integrity and topological regularity, achieving new state-of-the-art performance in production-ready mesh generation. Project Page: https://hitcslj.github.io/mesh-rft/.
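To make the Boundary Edge Ratio metric concrete, the following is a minimal sketch of an object-level BER computation, under the assumption that BER is the fraction of mesh edges incident to exactly one face (a watertight mesh would score 0.0); the paper's exact face-level formulation may differ:

```python
from collections import Counter

def boundary_edge_ratio(faces):
    """Illustrative BER: fraction of undirected edges used by exactly one face.

    `faces` is a list of vertex-index tuples, one per polygonal face.
    """
    edge_counts = Counter()
    for face in faces:
        n = len(face)
        for i in range(n):
            # Canonicalize each edge as a sorted vertex pair.
            edge = tuple(sorted((face[i], face[(i + 1) % n])))
            edge_counts[edge] += 1
    # Boundary edges belong to exactly one face; interior edges to two.
    boundary = sum(1 for count in edge_counts.values() if count == 1)
    return boundary / len(edge_counts)

# A single open triangle: all 3 edges are boundary edges.
print(boundary_edge_ratio([(0, 1, 2)]))  # → 1.0
# A closed tetrahedron: every edge is shared by two faces.
print(boundary_edge_ratio([(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]))  # → 0.0
```

Under this reading, a lower BER indicates fewer holes and open seams, which is why reducing it tracks geometric integrity.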