We introduce pixelSplat, a feed-forward model that learns to reconstruct 3D radiance fields parameterized by 3D Gaussian primitives from pairs of images. Our model features real-time and memory-efficient rendering for scalable training as well as fast 3D reconstruction at inference time. To overcome local minima inherent to sparse and locally supported representations, we predict a dense probability distribution over 3D and sample Gaussian means from that probability distribution. We make this sampling operation differentiable via a reparameterization trick, allowing us to back-propagate gradients through the Gaussian splatting representation. We benchmark our method on wide-baseline novel view synthesis on the real-world RealEstate10k and ACID datasets, where we outperform state-of-the-art light field transformers and accelerate rendering by 2.5 orders of magnitude while reconstructing an interpretable and editable 3D radiance field.
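The differentiable-sampling idea can be illustrated with a minimal sketch. The paper does not specify this exact implementation; the helper below is a hypothetical, framework-free illustration of one common reparameterization pattern for categorical sampling (a straight-through estimator over discretized depth buckets along a ray), where the hard sample is used in the forward pass and a soft expectation serves as the differentiable surrogate:

```python
import numpy as np

def sample_depth_reparameterized(logits, depth_buckets, rng):
    """Sample a depth (a proxy for a Gaussian mean along a ray) from a
    categorical distribution over discrete depth buckets.

    Returns both the hard sample and a differentiable surrogate
    (the expectation under the predicted distribution). In an autodiff
    framework, the straight-through trick combines them as
    `depth = soft + stop_gradient(hard - soft)` so gradients flow
    through `soft` while the forward value is `hard`.
    Hypothetical sketch, not the paper's exact mechanism.
    """
    # Softmax over the predicted per-bucket logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Hard sample: a discrete draw from the distribution.
    idx = rng.choice(len(depth_buckets), p=probs)
    hard = depth_buckets[idx]
    # Soft surrogate: expected depth, differentiable w.r.t. the logits.
    soft = float(probs @ depth_buckets)
    return hard, soft

rng = np.random.default_rng(0)
buckets = np.linspace(1.0, 10.0, 64)   # candidate depths along the ray
logits = -((buckets - 4.0) ** 2)       # distribution peaked near depth 4
hard, soft = sample_depth_reparameterized(logits, buckets, rng)
```

The soft expectation stays close to the distribution's mode, so gradients pushed through it move the predicted depth distribution in a sensible direction even though the forward pass uses a discrete sample.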