Single-image 3D reconstruction with large reconstruction models (LRMs) has advanced rapidly, yet reconstructions often exhibit geometric inconsistencies and misaligned details that limit fidelity. We introduce GeoFusionLRM, a geometry-aware self-correction framework that leverages the model's own normal and depth predictions to refine structural accuracy. Unlike prior approaches that rely solely on features extracted from the input image, GeoFusionLRM feeds back geometric cues through a dedicated transformer and fusion module, enabling the model to correct errors and enforce consistency with the conditioning image. This design improves the alignment between the reconstructed mesh and the input views without additional supervision or external signals. Extensive experiments demonstrate that GeoFusionLRM achieves sharper geometry, more consistent normals, and higher fidelity than state-of-the-art LRM baselines.