Recent feed-forward 3D reconstruction methods have demonstrated strong performance and flexibility in efficient end-to-end scene geometry estimation from image streams. However, their reliance on visible-light appearance makes them vulnerable in dark and low-visibility environments, where RGB cues are severely degraded and geometric evidence becomes ambiguous. To address this challenge, we propose DarkVGGT, an RGB-T feed-forward geometry framework that uses physics-aware thermal modeling for robust 3D estimation in low-light scenes. DarkVGGT introduces two complementary modules. First, physics-inspired thermal factorization extracts emissive-dominant, geometry-consistent thermal cues while isolating sparse reflective residuals that may introduce geometric ambiguity. Second, geometry-shared thermal routing isolates modality-invariant geometric structures from thermal-specific patterns, selectively injecting reliability-aware structural guidance into the RGB stream. Together, these components enable accurate thermal-informed geometry estimation under degraded RGB conditions while largely preserving performance in well-lit environments. Experiments on low-visibility RGB-T benchmarks demonstrate consistent improvements in both depth and camera pose estimation over existing feed-forward geometry baselines.
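As a rough illustration of the two ideas named above, the following toy sketch splits a 1-D thermal signal into a dominant component plus a sparse residual, then injects thermal features into RGB features through a reliability gate. It is not DarkVGGT's implementation: the abstract does not specify either module, so the median-filter baseline, the soft-thresholding step, the sigmoid gate, and every function and parameter name here (`factorize_thermal`, `gated_inject`, `tau`, `reliability`) are assumptions chosen for illustration.

```python
import math
import statistics


def median3(signal):
    """3-tap median filter with edge replication (stand-in for a smooth emissive estimate)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [statistics.median(padded[i:i + 3]) for i in range(len(signal))]


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; magnitudes below tau become zero."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def factorize_thermal(thermal, tau):
    """Split thermal into (dominant, residual) with dominant + residual == thermal.

    Assumption: deviations from the median baseline that exceed tau are treated
    as the sparse "reflective" residual; everything else stays in the dominant part.
    """
    baseline = median3(thermal)
    residual = [soft_threshold(t - b, tau) for t, b in zip(thermal, baseline)]
    dominant = [t - r for t, r in zip(thermal, residual)]
    return dominant, residual


def gated_inject(rgb_feat, thermal_feat, reliability):
    """Add thermal features to RGB features, scaled per element by a sigmoid gate.

    Assumption: `reliability` is a raw score (logit); high values let more
    thermal guidance through, strongly negative values leave RGB unchanged.
    """
    fused = []
    for r, t, s in zip(rgb_feat, thermal_feat, reliability):
        gate = 1.0 / (1.0 + math.exp(-s))
        fused.append(r + gate * t)
    return fused


if __name__ == "__main__":
    thermal = [1.0, 1.0, 5.0, 1.0]  # one isolated spike
    dominant, residual = factorize_thermal(thermal, tau=0.5)
    print(dominant, residual)
    print(gated_inject([1.0, 1.0], [2.0, 2.0], [0.0, -50.0]))
```

In this toy setting the isolated spike ends up in the residual while the flat background stays in the dominant component, and a strongly negative reliability score leaves the RGB feature essentially untouched. That is the qualitative behaviour the abstract attributes to the two modules, not a reproduction of them.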