Generalizing motion representations across diverse characters remains challenging because skeletal topology varies significantly across datasets and species, which hinders the development of scalable generative models. To bridge this gap, we propose a Semantic-Aware Topology-Agnostic framework that learns a unified latent manifold shared across disparate species. Unlike methods that rely on fixed hierarchies or rigid padding strategies, our approach uses a semantic modulation mechanism to align functionally corresponding joints, thereby decoupling motion from topology. This design enables the construction of a continuous, generation-friendly motion space from large-scale, unaligned raw BVH data. Experiments on human and animal datasets demonstrate that our framework achieves high-fidelity reconstruction and supports downstream text-to-motion tasks. Notably, the model enables zero-shot cross-species retargeting without paired data. Code and demos are available at: https://github.com/zzysteve/SATA