We present ReactionMamba, a novel framework for generating long 3D human reaction motions. ReactionMamba integrates a motion VAE for efficient motion encoding with Mamba-based state-space models that decode temporally consistent reactions. This design enables ReactionMamba to generate both short sequences of simple motions and long sequences of complex motions, such as dance and martial arts. We evaluate ReactionMamba on three datasets (NTU120-AS, Lindy Hop, and InterX) and demonstrate performance competitive with previous methods, including InterFormer, ReMoS, and Ready-to-React, in terms of realism, diversity, and long-sequence generation, while achieving substantial improvements in inference speed.
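For readers unfamiliar with state-space models, the following is a minimal sketch of the linear recurrence that this model family builds on. It is not the ReactionMamba architecture: the function name `ssm_scan`, the parameter shapes, and the diagonal, input-independent transition are assumptions made for illustration, and Mamba's input-dependent (selective) parameters and the motion VAE are omitted.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over a sequence.

    Illustrative only; all names and shapes are hypothetical.
    x: (T, D) input sequence, e.g. per-frame latent codes
    a: (N,)   diagonal state transition (|a| < 1 keeps the state bounded)
    b: (N, D) input projection
    c: (D, N) output projection
    Returns y: (T, D), one output per time step.
    """
    T, D = x.shape
    h = np.zeros(a.shape[0])      # hidden state starts at zero
    y = np.empty((T, D))
    for t in range(T):
        h = a * h + b @ x[t]      # state update: h_t = a * h_{t-1} + B x_t
        y[t] = c @ h              # readout:      y_t = C h_t
    return y
```

Each step touches only the fixed-size state `h`, so the cost of this loop grows linearly with the sequence length `T`, and the output at step `t` depends only on inputs up to `t`.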