In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed. However, knowing when to request such assistance is critical: too few requests can lead to the robot making mistakes, but too many requests can overload the expert. In this paper, we present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task. The confidence level is computed by estimating the variance of the return from the current state. We show that this estimate can be iteratively improved during training using a Bellman-like recursion. On discrete navigation problems with both fully- and partially-observable state information, we show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
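The abstract does not spell out the recursion, so the following is only a minimal sketch of one standard way to propagate return variance for a fixed policy, not the paper's algorithm. It assumes a small Markov reward process with a known model, and uses the identity Var(s) = E[δ² + γ² Var(s')] with δ = r + γV(s') − V(s). The paper instead learns its estimate during training; the state names, the `model` layout, the threshold, and the function names here are all hypothetical.

```python
# Illustrative sketch only: dynamic programming on a KNOWN toy model.
# model[s] is a list of (probability, next_state, reward); next_state None = terminal.

def lookup(table, state):
    """Value of a table entry, with terminal states (None) fixed at 0."""
    return 0.0 if state is None else table[state]


def evaluate_mean_and_variance(model, gamma, sweeps=100):
    """Return (V, VAR): expected return and variance of the return per state."""
    # Standard Bellman recursion for the expected return.
    V = {s: 0.0 for s in model}
    for _ in range(sweeps):
        for s, outcomes in model.items():
            V[s] = sum(p * (r + gamma * lookup(V, nxt)) for p, nxt, r in outcomes)

    # Bellman-like recursion for the variance of the return:
    # VAR(s) = E[ delta^2 + gamma^2 * VAR(s') ],  delta = r + gamma*V(s') - V(s)
    VAR = {s: 0.0 for s in model}
    for _ in range(sweeps):
        for s, outcomes in model.items():
            VAR[s] = sum(
                p * ((r + gamma * lookup(V, nxt) - V[s]) ** 2
                     + gamma ** 2 * lookup(VAR, nxt))
                for p, nxt, r in outcomes
            )
    return V, VAR


def should_ask_expert(variance, threshold, budget_left):
    """Hypothetical rule: ask only if calls remain and the variance is high."""
    return budget_left > 0 and variance > threshold


GAMMA = 0.9
model = {
    "start": [(1.0, "risky", 0.0)],              # always moves to the risky state
    "risky": [(0.5, None, 1.0), (0.5, None, 0.0)],  # coin flip: succeed or fail
    "safe": [(1.0, None, 1.0)],                  # always succeeds
}

V, VAR = evaluate_mean_and_variance(model, GAMMA)
for state in model:
    ask = should_ask_expert(VAR[state], threshold=0.1, budget_left=3)
    print(f"{state}: V={V[state]:.4f} Var={VAR[state]:.4f} ask={ask}")
```

In this toy model the return from `start` is 0.9 times a fair coin flip, so its variance is about 0.2025 and the rule would request help there, while the deterministic `safe` state has zero variance and would not. A fixed threshold is the simplest choice; how the threshold interacts with a limited call budget is a design decision this sketch does not address.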