Recent feed-forward Gaussian reconstruction models adopt a pixel-aligned formulation that maps each 2D pixel to a 3D Gaussian, tightly coupling the Gaussian representation to the input images. In this paper, we propose AnchorSplat, a novel feed-forward 3DGS framework for scene-level reconstruction that represents the scene directly in 3D space. AnchorSplat introduces an anchor-aligned Gaussian representation guided by 3D geometric priors (e.g., sparse point clouds, voxels, or RGB-D point clouds), yielding a more geometry-aware set of renderable 3D Gaussians that is independent of image resolution and the number of views. This design substantially reduces the number of required Gaussians, improving computational efficiency while enhancing reconstruction fidelity. Beyond the anchor-aligned design, we use a Gaussian Refiner to adjust the intermediate Gaussians in only a few forward passes. Experiments on the ScanNet++ v2 novel view synthesis (NVS) benchmark demonstrate state-of-the-art performance: AnchorSplat outperforms previous methods, with better view consistency and substantially fewer Gaussian primitives.
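To make the scaling argument concrete, the following is a minimal sketch, not the AnchorSplat implementation. It contrasts how the Gaussian count grows under a pixel-aligned formulation (one Gaussian per pixel per view) with an anchor-aligned one (a fixed number of Gaussians per 3D anchor). The function names, the voxel-centroid anchoring rule, the `gaussians_per_anchor` parameter, and the synthetic room dimensions are all assumptions chosen for illustration; the abstract does not specify how anchors are derived from the geometric prior.

```python
import math
import random
from collections import defaultdict


def pixel_aligned_count(num_views, height, width):
    """One Gaussian per pixel per view: grows with resolution and view count."""
    return num_views * height * width


def voxel_anchors(points, voxel_size):
    """Hypothetical anchoring rule: one anchor per occupied voxel,
    placed at the centroid of the prior points that fall inside it."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))
    anchors = []
    for pts in buckets.values():
        n = len(pts)
        anchors.append(tuple(sum(axis) / n for axis in zip(*pts)))
    return anchors


def anchor_aligned_count(anchors, gaussians_per_anchor=1):
    """Count depends only on the 3D prior, not on the input images."""
    return len(anchors) * gaussians_per_anchor


if __name__ == "__main__":
    # Synthetic stand-in for a sparse point cloud of a 4 m x 4 m x 2.5 m room.
    rng = random.Random(0)
    cloud = [
        (rng.uniform(0.0, 4.0), rng.uniform(0.0, 4.0), rng.uniform(0.0, 2.5))
        for _ in range(20000)
    ]
    anchors = voxel_anchors(cloud, voxel_size=0.25)

    for views in (8, 16, 32):
        print(
            views,
            pixel_aligned_count(views, 480, 640),
            anchor_aligned_count(anchors, gaussians_per_anchor=4),
        )
```

In this sketch the pixel-aligned count doubles each time the number of views doubles, while the anchor-aligned count stays fixed because it is determined by the occupied voxels of the prior alone. The actual reduction reported for AnchorSplat depends on details not given in the abstract.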