AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation

Animatable 3D assets, defined as geometry equipped with an articulated skeleton and skinning weights, are fundamental to interactive graphics, embodied agents, and animation production. While recent 3D generative models can synthesize visually plausible shapes from images, the results are typically static. Obtaining usable rigs via post-hoc auto-rigging is brittle and often produces skeletons that are topologically inconsistent with the generated geometry. We present AniGen, a unified framework that directly generates animation-ready 3D assets conditioned on a single image. Our key insight is to represent shape, skeleton, and skinning as mutually consistent $S^3$ Fields (Shape, Skeleton, Skin) defined over a shared spatial domain. To enable robust learning of these fields, we introduce two technical innovations: (i) a confidence-decaying skeleton field that explicitly handles the geometric ambiguity of bone prediction at Voronoi boundaries, and (ii) a dual skin feature field that decouples skinning weights from specific joint counts, allowing a fixed-architecture network to predict rigs of arbitrary complexity. Built upon a two-stage flow-matching pipeline, AniGen first synthesizes a sparse structural scaffold and then generates dense geometry and articulation in a structured latent space. Extensive experiments demonstrate that AniGen substantially outperforms state-of-the-art sequential baselines in rig validity and animation quality, generalizing effectively to in-the-wild images across diverse categories including animals, humanoids, and machinery. Homepage: https://yihua7.github.io/AniGen-web/

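To make the idea of decoupling skinning weights from the joint count more concrete, the following is a minimal sketch of one plausible reading, not the formulation used in AniGen, which the abstract does not specify. It assumes that each surface point and each joint carries a feature vector of the same dimension, and that skinning weights are a softmax over point-joint dot products. The function name `skin_weights`, the `temperature` parameter, and the feature dimension are hypothetical.

```python
import numpy as np

def skin_weights(point_feats, joint_feats, temperature=1.0):
    """Softmax over joints of point-joint feature similarity.

    ASSUMPTION: this similarity-plus-softmax form is illustrative only;
    it is not taken from the AniGen paper.

    point_feats: (P, D) array, one feature per surface point (hypothetical)
    joint_feats: (J, D) array, one feature per joint (hypothetical)
    returns:     (P, J) array; each row is non-negative and sums to 1
    """
    logits = point_feats @ joint_feats.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(5, 8))  # 5 surface points, 8-dim features
    # The same function handles rigs with different joint counts.
    for num_joints in (3, 17):
        joints = rng.normal(size=(num_joints, 8))
        w = skin_weights(points, joints)
        print(num_joints, w.shape, np.allclose(w.sum(axis=1), 1.0))
```

Under this assumption, the output dimension of the weight matrix is set by how many joint features are supplied rather than by the network architecture, which is the property the abstract attributes to the dual skin feature field.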