With the rapid growth of computing power and recent advances in deep learning, we have witnessed impressive demonstrations of novel robot capabilities in research settings. Nonetheless, these learning systems exhibit brittle generalization and require excessive training data for practical tasks. To harness the capabilities of state-of-the-art robot learning models while embracing their imperfections, we present Sirius, a principled framework for humans and robots to collaborate through a division of work. In this framework, partially autonomous robots handle the major portion of decision-making where they work reliably; meanwhile, human operators monitor the process and intervene in challenging situations. Such a human-robot team ensures safe deployments in complex tasks. Further, we introduce a new learning algorithm to improve the policy's performance on the data collected from the task executions. The core idea is re-weighting training samples with approximated human trust and optimizing the policies with weighted behavioral cloning. We evaluate Sirius in simulation and on real hardware, showing that Sirius consistently outperforms baselines over a collection of contact-rich manipulation tasks, achieving an 8% boost in simulation and a 27% boost on real hardware over the state-of-the-art methods, with 2x faster convergence and an 85% reduction in memory size. Videos and code are available at https://ut-austin-rpl.github.io/sirius/
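The re-weighting idea can be illustrated with a toy example. The sketch below is not the Sirius implementation: it assumes, purely for illustration, that human trust is approximated by a fixed weight per sample category (the category names and values in `TRUST_WEIGHTS` are hypothetical), and it fits a one-dimensional linear policy `action = gain * state` by minimizing a weighted behavioral cloning loss in closed form.

```python
from dataclasses import dataclass

# Hypothetical per-category weights standing in for "approximated human trust".
# The categories and values are illustrative assumptions, not taken from the paper.
TRUST_WEIGHTS = {"robot": 1.0, "intervention": 3.0, "pre_intervention": 0.0}


@dataclass
class Sample:
    state: float
    action: float
    label: str  # which phase of deployment produced this sample


def weighted_bc_loss(gain, samples):
    """Weighted mean squared error between gain * state and the recorded action."""
    total_weight = sum(TRUST_WEIGHTS[s.label] for s in samples)
    weighted_error = sum(
        TRUST_WEIGHTS[s.label] * (gain * s.state - s.action) ** 2 for s in samples
    )
    return weighted_error / total_weight


def fit_gain(samples):
    """Closed-form minimizer of the weighted loss for the policy action = gain * state."""
    numerator = sum(TRUST_WEIGHTS[s.label] * s.state * s.action for s in samples)
    denominator = sum(TRUST_WEIGHTS[s.label] * s.state ** 2 for s in samples)
    return numerator / denominator


samples = [
    Sample(1.0, 2.0, "robot"),
    Sample(2.0, 4.0, "robot"),
    Sample(1.0, 5.0, "pre_intervention"),  # robot action just before a human takeover
    Sample(1.0, 3.0, "intervention"),      # human correction
]

gain = fit_gain(samples)
print(f"fitted gain: {gain:.3f}, weighted loss: {weighted_bc_loss(gain, samples):.3f}")
```

With these made-up numbers, the human correction pulls the fitted gain toward 3 while the zero-weighted pre-intervention sample has no influence. A real system would use a neural network policy and gradient-based optimization, but the loss keeps the same weighted form.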