Vision-based teleoperation offers the possibility of endowing robots with human-level intelligence to physically interact with the environment, while requiring only low-cost camera sensors. However, current vision-based teleoperation systems are designed and engineered for a particular robot model and deployment environment, which scales poorly as the pool of robot models expands and the variety of operating environments increases. In this paper, we propose AnyTeleop, a unified and general teleoperation system that supports multiple different arms, hands, realities, and camera configurations within a single system. Despite being designed for great flexibility in the choice of simulators and real hardware, our system still achieves strong performance. In real-world experiments, AnyTeleop outperforms a previous system that was designed for a specific robot hardware, achieving a higher success rate on the same robot. For teleoperation in simulation, AnyTeleop leads to better imitation learning performance than a previous system designed specifically for that simulator. Project page: https://yzqin.github.io/anyteleop/.