In a Human-in-the-Loop paradigm, a robotic agent acts mostly autonomously in solving a task but can request help from an external expert when needed. Knowing when to request such assistance is critical: too few requests can lead the robot to make mistakes, while too many can overload the expert. In this paper, we present a Reinforcement Learning-based approach to this problem, in which a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task. The confidence level is computed by estimating the variance of the return from the current state. We show that this estimate can be iteratively improved during training using a Bellman-like recursion. On discrete navigation problems with both fully and partially observable state information, we show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
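To make the idea of a Bellman-like recursion for the variance of the return concrete, here is a minimal tabular sketch. The abstract does not give the recursion itself, so this uses the standard second-moment identity for a Markov reward process under a fixed policy, which may differ from the formulation in the paper. The names `return_variance` and `should_ask`, the four-state toy problem, and the threshold value are all hypothetical and chosen only for illustration.

```python
import numpy as np

def return_variance(P, R, gamma=0.9, sweeps=200):
    """Estimate the mean and variance of the return for every state.

    P[s, t] is the transition probability s -> t under a fixed policy and
    R[s, t] is the reward on that transition (both are assumptions of this
    sketch). M tracks the second moment E[G^2 | s], so Var = M - V^2.
    """
    n = P.shape[0]
    V = np.zeros(n)  # first moment of the return
    M = np.zeros(n)  # second moment of the return
    for _ in range(sweeps):
        # Standard Bellman backup for the expected return.
        V_new = (P * (R + gamma * V[None, :])).sum(axis=1)
        # Bellman-like backup for the second moment:
        # E[(r + gamma*G')^2] = E[r^2 + 2*gamma*r*V(s') + gamma^2*M(s')]
        M_new = (P * (R**2 + 2 * gamma * R * V[None, :]
                      + gamma**2 * M[None, :])).sum(axis=1)
        V, M = V_new, M_new
    return V, M - V**2

def should_ask(variance, state, budget, threshold=0.1):
    """Hypothetical rule: call the expert if variance is high and budget remains."""
    return budget > 0 and variance[state] > threshold

# Toy problem: 0 = safe cell, 1 = risky cell, 2 = goal, 3 = pit.
# Goal and pit are absorbing with zero reward.
P = np.array([
    [0.0, 0.0, 1.0, 0.0],   # safe cell always reaches the goal
    [0.0, 0.0, 0.5, 0.5],   # risky cell reaches goal or pit with equal odds
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
R = np.zeros((4, 4))
R[0, 2] = 1.0  # reward for entering the goal from the safe cell
R[1, 2] = 1.0  # reward for entering the goal from the risky cell

V, var = return_variance(P, R)
ask_in_safe = should_ask(var, state=0, budget=3)
ask_in_risky = should_ask(var, state=1, budget=3)
```

In this toy problem the return from the risky cell is a fair coin flip between 0 and 1, so its variance is high and the rule requests help there, while the safe cell has zero variance and the agent proceeds on its own. Tracking the second moment alongside the value is a design choice of this sketch: it keeps both updates linear in the current estimates, so they can be run together in the same sweep.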