Real-time optical Motion Capture (MoCap) systems have not benefited from the advances in modern data-driven modeling. In this work, we apply machine learning to resolve noisy, unstructured marker estimates in real time and deliver robust marker-based MoCap even when using sparse, affordable sensors. To achieve this, we focus on two challenges related to model training: the sourcing of training data and its long-tailed distribution. Leveraging representation learning, we design a technique for imbalanced regression that requires no additional data or labels and improves the performance of our model on rare and challenging poses. By relying on a unified representation, we show that training such a model is not bound to high-end MoCap training data acquisition, and we exploit advances in marker-less MoCap to acquire the necessary data. Finally, we take a step towards richer and more affordable MoCap by adapting a body model-based inverse kinematics solution to account for measurement and inference uncertainty, further improving performance and robustness. Project page: https://moverseai.github.io/noise-tail