Real-time human motion tracking on a standalone VR Head-Mounted Display (HMD), such as Meta Quest or PICO, is especially challenging. In this paper, we propose HMD-Poser, the first unified approach to recover full-body motions using scalable sparse observations from an HMD and body-worn IMUs. In particular, it supports a variety of input scenarios, such as HMD, HMD+2IMUs, and HMD+3IMUs. This input scalability lets users trade off between high tracking accuracy and wearing convenience. A lightweight temporal-spatial feature learning network is proposed in HMD-Poser to guarantee that the model runs in real-time on HMDs. Furthermore, HMD-Poser performs online body shape estimation to improve the position accuracy of body joints. Extensive experiments on the challenging AMASS dataset show that HMD-Poser achieves new state-of-the-art results in both accuracy and real-time performance. We also build a new free-dancing motion dataset to evaluate HMD-Poser's on-device performance and to investigate the performance gap between synthetic data and real-captured sensor data. Finally, we demonstrate HMD-Poser with a real-time avatar-driving application on a commercial HMD. Our code and free-dancing motion dataset are available at https://pico-ai-team.github.io/hmd-poser.