Generalizable 3D Gaussian Splatting aims to predict Gaussian parameters directly with a feed-forward network for scene reconstruction. Among these parameters, the Gaussian means are particularly difficult to predict, so depth is usually estimated first and then unprojected to obtain the Gaussian centers. Existing methods typically rely on a single warp to estimate depth probability, which limits their ability to exploit cross-view geometric cues and results in unstable, coarse depth maps. To address this limitation, we propose IDESplat, which iteratively applies warp operations to boost depth probability estimation for accurate Gaussian mean prediction. First, to eliminate the inherent instability of a single warp, we introduce a Depth Probability Boosting Unit (DPBU) that multiplicatively integrates the epipolar attention maps produced by cascaded warp operations. Next, we construct an iterative depth estimation process by stacking multiple DPBUs, progressively identifying depth candidates with high likelihood. As IDESplat iteratively boosts the depth probability estimates and updates the depth candidates, the depth map is gradually refined, yielding accurate Gaussian means. We conduct experiments on RealEstate10K (RE10K), ACID, and DL3DV. IDESplat achieves state-of-the-art reconstruction quality with real-time efficiency. On RE10K, it outperforms DepthSplat by 0.33 dB in PSNR while using only 10.7% of the parameters and 70% of the memory. In cross-dataset experiments, IDESplat also improves PSNR by 2.95 dB over DepthSplat on the DTU dataset, demonstrating strong generalization ability.
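The two ideas the abstract leans on, multiplicative boosting of depth probabilities and unprojection of depth to a Gaussian center, can be illustrated with a toy sketch for a single pixel. This is not the authors' DPBU. The abstract confirms only that attention maps from several warps are combined multiplicatively and that depth is unprojected to obtain Gaussian means. Everything else here is an assumption made for illustration: the per-warp inputs are plain matching scores over a fixed list of depth candidates, they are normalized with a softmax, depth is read out as the probability-weighted mean, and the camera is an ideal pinhole. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw matching scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def boost_depth_probability(score_sets):
    """Multiply per-warp probabilities over the same depth candidates
    and renormalize (assumed stand-in for multiplicative integration)."""
    combined = [1.0] * len(score_sets[0])
    for scores in score_sets:
        probs = softmax(scores)
        combined = [c * p for c, p in zip(combined, probs)]
    total = sum(combined)
    return [c / total for c in combined]


def expected_depth(probs, candidates):
    """Probability-weighted mean of the depth candidates (an assumption)."""
    return sum(p * d for p, d in zip(probs, candidates))


def unproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth into camera
    coordinates under an ideal pinhole model."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


# Hypothetical scores from two warps over four depth candidates.
candidates = [1.0, 2.0, 3.0, 4.0]
warp_scores = [
    [0.1, 1.2, 0.9, 0.0],
    [0.0, 1.0, 0.2, 0.1],
]

single = softmax(warp_scores[0])
boosted = boost_depth_probability(warp_scores)
depth = expected_depth(boosted, candidates)
center = unproject(320.0, 240.0, depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

In this toy input both warps favor the second candidate, so the product is more sharply peaked there than either distribution alone, which is the intuition behind combining several warps instead of trusting one.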