OMEGA-Avatar: One-shot Modeling of 360° Gaussian Avatars

Creating high-fidelity, animatable 3D avatars from a single image remains a formidable challenge. We identify three desirable attributes of avatar generation: the method should 1) be feed-forward, 2) model a full 360° head, and 3) be animation-ready. However, existing methods address only two of these three attributes at once. To close this gap, we propose OMEGA-Avatar, the first feed-forward framework that generates a generalizable, 360°-complete, and animatable 3D Gaussian head from a single image. Starting from a feed-forward, animatable framework, we address 360° full-head avatar generation with two novel components. First, to overcome poor hair modeling in full-head avatar generation, we introduce a semantic-aware mesh deformation module that integrates multi-view normals to optimize a FLAME head with hair while preserving its topology. Second, to enable effective feed-forward decoding of full-head features, we propose a multi-view feature splatting module that constructs a shared canonical UV representation from features across multiple views through differentiable bilinear splatting, hierarchical UV mapping, and visibility-aware fusion. This approach preserves both global structural coherence and local high-frequency details across all viewpoints, ensuring 360° consistency without per-instance optimization. Extensive experiments demonstrate that OMEGA-Avatar achieves state-of-the-art performance, significantly outperforming existing baselines in 360° full-head completeness while robustly preserving identity across viewpoints.
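
To make the idea of bilinear splatting with visibility-aware fusion concrete, the following is a minimal NumPy sketch and not the implementation used in OMEGA-Avatar. The function name `splat_to_uv`, its inputs, and the weighted-average normalization are all assumptions introduced for illustration; hierarchical UV mapping and the gradient path are omitted. Each feature is scattered to its four neighboring texels with bilinear weights scaled by a visibility weight, and features from several views are fused by concatenating them before the call.

```python
import numpy as np

def splat_to_uv(uv, feats, vis, size, eps=1e-8):
    """Scatter features into a (size, size, C) UV map (hypothetical helper).

    uv:    (N, 2) UV coordinates in [0, 1], one per feature
    feats: (N, C) features gathered from one or more views
    vis:   (N,)   non-negative visibility weights (assumed given)
    size:  UV map resolution, assumed >= 2
    """
    n, c = feats.shape
    acc = np.zeros((size, size, c), dtype=np.float64)   # weighted feature sum
    wsum = np.zeros((size, size), dtype=np.float64)     # accumulated weights

    # Map [0, 1] to continuous texel coordinates in [0, size - 1].
    xy = np.clip(uv, 0.0, 1.0) * (size - 1)
    x0 = np.minimum(np.floor(xy[:, 0]).astype(int), size - 2)
    y0 = np.minimum(np.floor(xy[:, 1]).astype(int), size - 2)
    fx = xy[:, 0] - x0
    fy = xy[:, 1] - y0

    # Four bilinear corners: (dx, dy, weight).
    corners = (
        (0, 0, (1 - fx) * (1 - fy)),
        (1, 0, fx * (1 - fy)),
        (0, 1, (1 - fx) * fy),
        (1, 1, fx * fy),
    )
    for dx, dy, w in corners:
        wv = w * vis
        # np.add.at accumulates correctly when several points hit one texel.
        np.add.at(acc, (y0 + dy, x0 + dx), wv[:, None] * feats)
        np.add.at(wsum, (y0 + dy, x0 + dx), wv)

    # Normalize where any weight landed; untouched texels stay zero.
    filled = wsum > eps
    out = np.zeros_like(acc)
    out[filled] = acc[filled] / wsum[filled][:, None]
    return out, filled
```

In this sketch, two views that observe the same texel contribute in proportion to their visibility weights, so a view that sees a surface point clearly dominates one that sees it at a grazing angle. The returned `filled` mask marks texels that received no observation and would need to be completed by other means.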
