In this paper, we present a novel method for mobile manipulators to perform multiple contact-rich manipulation tasks. While learning-based methods have the potential to generate actions in an end-to-end manner, they often suffer from insufficient action accuracy and poor robustness to noise. Classical control-based methods, on the other hand, can enhance system robustness, but at the cost of extensive parameter tuning. To address these challenges, we propose MOMA-Force, a visual-force imitation method that combines representation learning for perception, imitation learning for complex motion generation, and admittance whole-body control for system robustness and controllability. MOMA-Force enables a mobile manipulator to learn multiple complex contact-rich tasks with high success rates and small contact forces. In a real household setting, our method outperforms baseline methods in task success rate. Moreover, it achieves smaller contact forces and smaller force variances than baseline methods without force imitation. Overall, we offer a promising approach for efficient and robust mobile manipulation in the real world. Videos and more details can be found at \url{https://visual-force-imitation.github.io}.