It is crucial that users be empowered to use a robot's functionalities to creatively solve problems on the fly. A user with access to a Reinforcement Learning (RL) based robot may want to combine the robot's autonomy with their own knowledge of its behavior to complete new tasks. One way to do so is for the user to take control of part of the robot's action space through teleoperation while the RL policy simultaneously controls the rest. However, an out-of-the-box RL policy may not readily facilitate this. For example, the user's control may bring the robot into what is, from the policy's perspective, a failure state, causing the policy to act in ways the user is not familiar with and hindering the user's desired task. In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions (IODA), an initial algorithm for addressing it and empowering users to leverage their expectations of a robot's behavior to accomplish new tasks.