Marker-based motion capture (MoCap) systems have long been the gold standard for accurate 4D human modeling, but their reliance on specialized hardware and markers limits scalability and real-world deployment. Advancing reliable markerless 4D human motion capture requires datasets that reflect the complexity of real-world human interactions. However, existing benchmarks often lack realistic multi-person dynamics, severe occlusions, and challenging interaction patterns, leading to a persistent domain gap. In this work, we present a new dataset and evaluation for complex 4D markerless human motion capture. Our proposed MoCap dataset captures both single-person and multi-person scenarios with intricate motions, frequent inter-person occlusions, rapid position exchanges between similarly dressed subjects, and varying subject distances. It includes synchronized multi-view RGB and depth sequences, accurate camera calibration, ground-truth 3D motion capture data from a Vicon system, and corresponding SMPL/SMPL-X parameters. This setup ensures precise alignment between visual observations and motion ground truth. Benchmarking state-of-the-art markerless MoCap models reveals substantial performance degradation under these realistic conditions, highlighting the limitations of current approaches. We further demonstrate that targeted fine-tuning improves generalization, validating the dataset's realism and value for model development. Our evaluation exposes critical gaps in existing models and provides a rigorous foundation for advancing robust markerless 4D human motion capture.