Existing approaches to human avatar generation, whether based on NeRF or on 3D Gaussian Splatting (3DGS), struggle to maintain 3D consistency and lose reconstruction detail, particularly when trained on sparse inputs. To address this challenge, we propose CHASE, a novel framework that achieves dense-input-level performance using only sparse inputs through two key innovations: cross-pose intrinsic 3D consistency supervision and 3D geometry contrastive learning. Building on prior skeleton-driven approaches that combine rigid deformation with non-rigid cloth dynamics, we first establish baseline avatars with fundamental 3D consistency. To enhance 3D consistency under sparse inputs, we introduce a Dynamic Avatar Adjustment (DAA) module, which refines the deformed Gaussians by leveraging similar poses from the training set. By minimizing the rendering discrepancy between the adjusted Gaussians and the reference poses, DAA provides additional supervision for avatar reconstruction. We further maintain global 3D consistency through a novel geometry-aware contrastive learning strategy. Although designed for sparse inputs, CHASE surpasses state-of-the-art methods in both the full and sparse settings on the ZJU-MoCap and H36M datasets, demonstrating that the enhanced 3D consistency leads to superior rendering quality.
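As an illustration of the pose-retrieval step that DAA relies on, the following minimal sketch selects the training pose closest to a given query pose. The abstract does not specify how pose similarity is measured, so the Euclidean distance over flattened pose vectors used here is an assumption, and the names `pose_distance`, `most_similar_pose`, and `train_poses` are hypothetical and chosen only for this example.

```python
import math


def pose_distance(a, b):
    """Euclidean distance between two flattened pose vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("pose vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar_pose(query_idx, poses):
    """Return the index of the pose closest to poses[query_idx], excluding the query itself."""
    best_idx, best_dist = None, math.inf
    for i, pose in enumerate(poses):
        if i == query_idx:
            continue  # a pose should not serve as its own reference
        d = pose_distance(poses[query_idx], pose)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx


# Toy data: three hypothetical 3-dimensional pose vectors.
train_poses = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [1.0, 1.0, 1.0],
]

reference_idx = most_similar_pose(0, train_poses)
```

In a full pipeline, the frame at the retrieved index would presumably serve as the reference against which the adjusted Gaussians are rendered and compared; that rendering step is outside the scope of this sketch.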