When accomplishing a task, human experts exhibit intentional behavior. Their unique intents shape their plans and decisions, so different experts demonstrate diverse behaviors when accomplishing the same task. Because of real-world uncertainty and their bounded rationality, experts sometimes adjust their intents during task execution, which in turn influences their behavior. This paper introduces IDIL, a novel imitation learning algorithm that mimics these diverse intent-driven behaviors of experts. Our approach iterates between estimating expert intent from heterogeneous demonstrations and using those estimates to learn an intent-aware model of expert behavior. Unlike contemporary approaches, IDIL can address sequential tasks with high-dimensional state representations while sidestepping the complexities and drawbacks of adversarial training (a mainstay of related techniques). Our empirical results suggest that the models generated by IDIL match or surpass those produced by recent imitation learning baselines on task performance metrics. Moreover, because it creates a generative model, IDIL demonstrates superior performance on intent inference metrics, which are crucial for human-agent interaction, and aptly captures a broad spectrum of expert behaviors.