In this paper, we introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns, enriched by multi-view and multimodal visual data. Using a teleoperation system, we synchronize human and robot hand poses in real time. This collection of human-like motions is crucial for training dexterous hands to mimic human movements more naturally and precisely. RealDex holds immense promise for advancing humanoid robots in automated perception, cognition, and manipulation in real-world scenarios. Moreover, we introduce a cutting-edge dexterous grasping motion generation framework that aligns with human experience and enhances real-world applicability by effectively utilizing Multimodal Large Language Models. Extensive experiments demonstrate the superior performance of our method on RealDex and other open datasets. The complete dataset and code will be made available upon the publication of this work.