We present a deep dive into a real-world robotic learning system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and can precisely return the ball to desired targets. The system combines a highly optimized perception subsystem, a high-speed, low-latency robot controller, a simulation paradigm that can prevent damage in the real world and also train policies for zero-shot transfer, and automated real-world environment resets that enable autonomous training and evaluation on physical robots. We complement a complete system description, including numerous design decisions that are typically not widely disseminated, with a collection of studies that clarify the importance of mitigating various sources of latency and accounting for distribution shifts between training and deployment, as well as the robustness of the perception system, the sensitivity to policy hyperparameters, and the choice of action space. A video demonstrating the components of the system and details of the experimental results can be found at https://youtu.be/uFcnWjB42I0.