A Dataset and Evaluation for Complex 4D Markerless Human Motion Capture

Marker-based motion capture (MoCap) systems have long been the gold standard for accurate 4D human modeling, yet their reliance on specialized hardware and markers limits scalability and real-world deployment. Advancing reliable markerless 4D human motion capture requires datasets that reflect the complexity of real-world human interactions. However, existing benchmarks often lack realistic multi-person dynamics, severe occlusions, and challenging interaction patterns, leading to a persistent domain gap. In this work, we present a new dataset and evaluation for complex 4D markerless human motion capture. Our proposed MoCap dataset captures both single- and multi-person scenarios with intricate motions, frequent inter-person occlusions, rapid position exchanges between similarly dressed subjects, and varying subject distances. It includes synchronized multi-view RGB and depth sequences, accurate camera calibration, ground-truth 3D motion capture from a Vicon system, and corresponding SMPL/SMPL-X parameters. This setup ensures precise alignment between visual observations and motion ground truth. Benchmarking state-of-the-art markerless MoCap models reveals substantial performance degradation under these realistic conditions, highlighting the limitations of current approaches. We further demonstrate that targeted fine-tuning improves generalization, validating the dataset's realism and value for model development. Our evaluation exposes critical gaps in existing models and provides a rigorous foundation for advancing robust markerless 4D human motion capture.

