In this paper, we tackle the problem of how to build and benchmark a large motion model (LMM). The ultimate goal of an LMM is to serve as a foundation model for versatile motion-related tasks, e.g., human motion generation, with interpretability and generalizability. Despite recent progress, LMM-related works are still limited by small-scale motion data and costly text descriptions. Moreover, previous motion benchmarks focus primarily on pure body movements and neglect the ubiquitous motions in context, i.e., humans interacting with humans, objects, and scenes. To address these limitations, we consolidate large-scale video action datasets as knowledge banks to build MotionBank, which comprises 13 video action datasets, 1.24M motion sequences, and 132.9M frames of natural and diverse human motions. Unlike laboratory-captured motions, in-the-wild human-centric videos contain abundant motions in context. To facilitate better motion-text alignment, we also devise a motion caption generation algorithm that automatically produces rule-based, unbiased, and disentangled text descriptions from the kinematic characteristics of each motion. Extensive experiments show that MotionBank benefits general motion-related tasks, namely human motion generation, motion in-context generation, and motion understanding. Video motions together with rule-based text annotations could serve as an efficient alternative for training larger LMMs. Our dataset, code, and benchmark will be publicly available at https://github.com/liangxuy/MotionBank.