This paper presents RoGSplat, a novel approach for synthesizing high-fidelity novel views of unseen humans from sparse multi-view images, without requiring cumbersome per-subject optimization. Unlike previous methods, which typically struggle with sparse views that have little overlap and are less effective at reconstructing complex human geometry, the proposed method enables robust reconstruction under such challenging conditions. Our key idea is to lift SMPL vertices to dense and reliable 3D prior points representing accurate human body geometry, and then regress human Gaussian parameters from these points. To account for possible misalignment between the SMPL model and the images, we propose to predict image-aligned 3D prior points by leveraging both pixel-level and voxel-level features, from which we regress coarse Gaussians. To better capture high-frequency details, we further render depth maps from the coarse 3D Gaussians to guide the regression of fine-grained pixel-wise Gaussians. Experiments on several benchmark datasets demonstrate that our method outperforms state-of-the-art methods in novel view synthesis and cross-dataset generalization. Our code is available at https://github.com/iSEE-Laboratory/RoGSplat.