Partial Automation (PA) with intelligent support systems has been introduced in industrial machinery and advanced automobiles to reduce the burden of long hours of human operation. Under PA, operators perform two kinds of operation: manual operation (providing actions) and switching between automatic and manual modes (mode-switching). Because PA reduces the total duration of manual operation, both the action and mode-switching operations can be replicated by imitation learning with high sample efficiency. To this end, this paper proposes Disturbance Injection under Partial Automation (DIPA), a novel imitation learning framework. In DIPA, the mode and the actions (in manual mode) are assumed to be observable in each state and are used to learn both action and mode-switching policies. The learning is made robust by injecting disturbances into the operator's actions, with the disturbance level optimized to minimize the covariate shift under PA. We experimentally validated the effectiveness of our method on long-horizon tasks in two simulations and a real robot environment, and confirmed that it outperformed previous methods and reduced the demonstration burden.
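The data-collection idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the 1-D system, the operator's switching rule, both controllers, and every name (`operator_mode`, `operator_action`, `auto_action`, `collect_demo`, `sigma`) are hypothetical choices made for illustration. The disturbance level `sigma` is passed in as a fixed value, whereas DIPA is described as optimizing it; that optimization is not shown here.

```python
import random

MANUAL, AUTO = 0, 1  # hypothetical mode labels


def operator_mode(state):
    # Assumed operator rule: take over manually when far from the target (0).
    return MANUAL if abs(state) > 0.5 else AUTO


def operator_action(state):
    # Assumed manual action: proportional correction toward the target.
    return -0.5 * state


def auto_action(state):
    # Assumed built-in automatic controller, not learned from demonstrations.
    return -0.1 * state


def collect_demo(sigma, steps=50, seed=0):
    """Roll out one demonstration, injecting Gaussian noise in manual mode.

    Each record is (state, mode, action). In manual mode the operator's clean
    (undisturbed) action is stored as the label, while the disturbed action is
    what is actually executed. In automatic mode no action label is stored.
    """
    rng = random.Random(seed)
    state = 2.0
    data = []
    for _ in range(steps):
        mode = operator_mode(state)
        if mode == MANUAL:
            clean = operator_action(state)
            executed = clean + rng.gauss(0.0, sigma)  # disturbance injection
            data.append((state, mode, clean))
        else:
            executed = auto_action(state)
            data.append((state, mode, None))
        state = state + executed  # toy 1-D dynamics
    return data


if __name__ == "__main__":
    demo = collect_demo(sigma=0.1)
    n_manual = sum(1 for _, mode, _ in demo if mode == MANUAL)
    print(f"{n_manual} manual steps out of {len(demo)}")
```

In this sketch, the recorded `(state, mode)` pairs would serve as training data for a mode-switching policy and the manual-mode `(state, action)` pairs for an action policy. Because the executed actions are perturbed, the visited states spread around the operator's nominal trajectory, which is the mechanism the abstract relies on to reduce covariate shift.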