With the rapid growth of computing power and recent advances in deep learning, we have witnessed impressive demonstrations of novel robot capabilities in research settings. Nonetheless, these learning systems exhibit brittle generalization and require excessive training data for practical tasks. To harness the capabilities of state-of-the-art robot learning models while embracing their imperfections, we present Sirius, a principled framework for humans and robots to collaborate through a division of work. In this framework, partially autonomous robots handle the major portion of decision-making where they work reliably, while human operators monitor the process and intervene in challenging situations. Such a human-robot team ensures safe deployments in complex tasks. Further, we introduce a new learning algorithm that improves the policy's performance using the data collected from task executions. The core idea is to re-weight training samples with approximated human trust and to optimize the policies with weighted behavioral cloning. We evaluate Sirius in simulation and on real hardware, showing that it consistently outperforms baselines over a collection of contact-rich manipulation tasks: it improves policy success rate over state-of-the-art methods by 8% in simulation and 27% on real hardware, with 2x faster convergence and an 85% reduction in memory size. Videos and more details are available at https://ut-austin-rpl.github.io/sirius/
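The re-weighting idea can be illustrated with a minimal sketch of weighted behavioral cloning. Everything specific here is an assumption made for illustration and is not taken from the Sirius implementation: the sample-source labels, the numeric values in `TRUST_WEIGHTS`, the linear policy, and the squared-error loss are all hypothetical stand-ins for "approximated human trust" and the actual policy class.

```python
import numpy as np

# Hypothetical trust weights per sample source. The abstract gives no values;
# these only illustrate "trust more" versus "trust less".
TRUST_WEIGHTS = {
    "human_intervention": 2.0,  # assumed: human corrections are trusted most
    "robot": 1.0,               # assumed: ordinary autonomous robot samples
    "pre_intervention": 0.1,    # assumed: robot samples just before a human takes over
}


def sample_weights(labels):
    """Map each sample's source label to a weight, normalized to mean 1."""
    w = np.array([TRUST_WEIGHTS[label] for label in labels], dtype=float)
    return w / w.mean()


def weighted_bc_loss(W, states, actions, weights):
    """Weighted behavioral cloning loss for a linear policy a = s @ W."""
    residual = states @ W - actions
    per_sample = np.sum(residual ** 2, axis=1)
    return float(np.mean(weights * per_sample))


def fit_weighted_bc(states, actions, weights):
    """Minimize the weighted squared-error loss via weighted least squares."""
    sw = np.sqrt(weights)[:, None]  # scaling rows by sqrt(w) weights the squared error by w
    W, *_ = np.linalg.lstsq(states * sw, actions * sw, rcond=None)
    return W


# Two conflicting samples for the same state: a human correction and the
# robot action that preceded the intervention.
states = np.array([[1.0], [1.0]])
actions = np.array([[2.0], [-1.0]])
labels = ["human_intervention", "pre_intervention"]

W_weighted = fit_weighted_bc(states, actions, sample_weights(labels))
W_uniform = fit_weighted_bc(states, actions, np.ones(len(labels)))
```

In this toy case the weighted fit is pulled toward the human-corrected action, while the uniform fit averages the two conflicting actions. A neural-network policy would apply the same per-sample weights inside its training loss instead of solving in closed form.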