Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, low-texture surfaces, specular highlights, and instrument occlusions often fragment geometric continuity, which challenges existing fixed-topology approaches. To address this, we propose EndoVGGT, a geometry-centric framework equipped with a Deformation-aware Graph Attention (DeGAT) module. Rather than relying on static spatial neighborhoods, DeGAT dynamically constructs semantic graphs in feature space to capture long-range correlations among coherent tissue regions. This enables robust propagation of structural cues across occlusions, enforcing global consistency and improving non-rigid deformation recovery. Extensive experiments on SCARED show that our method significantly improves reconstruction fidelity, increasing PSNR by 24.6% and SSIM by 9.1% over the prior state of the art. Crucially, EndoVGGT exhibits strong zero-shot cross-dataset generalization to the unseen SCARED and EndoNeRF domains, suggesting that DeGAT learns domain-agnostic geometric priors. These results highlight the efficacy of dynamic feature-space modeling for consistent surgical 3D reconstruction.
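To make the idea of a dynamically constructed feature-space graph concrete, the following is a minimal sketch, not the DeGAT module itself. The abstract does not specify how the graph or the attention weights are computed, so everything here is an assumption made for illustration: the function names `feature_space_knn` and `graph_attention`, the use of cosine similarity, the k-nearest-neighbor rule, the softmax `temperature`, and the residual update are all hypothetical choices.

```python
import numpy as np


def feature_space_knn(features, k):
    """Build a k-nearest-neighbor graph in feature space (assumed: cosine similarity).

    features: (N, D) array of per-token features.
    Returns (neighbors, sim): neighbor indices of shape (N, k) and the
    (N, N) similarity matrix with the diagonal set to -inf.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-8
    normed = features / norms
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a token is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    return neighbors, sim


def graph_attention(features, neighbors, sim, temperature=0.1):
    """Aggregate neighbor features with softmax weights (assumed attention form).

    The weights come from the same similarities used to build the graph;
    a learned attention function could be substituted here.
    """
    neighbor_sim = np.take_along_axis(sim, neighbors, axis=1)  # (N, k)
    logits = neighbor_sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # features[neighbors] has shape (N, k, D); weight and sum over neighbors.
    aggregated = np.einsum("nk,nkd->nd", weights, features[neighbors])
    return features + aggregated  # residual update (assumption)
```

Because neighbors are chosen by feature similarity rather than by pixel or token position, two patches of the same tissue separated by an instrument can still be linked, which is the property the abstract attributes to replacing static spatial neighborhoods. The graph is rebuilt from the current features on every call, so it changes as the features change.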