With the fast improvement of machine learning, reinforcement learning (RL) has been used to automate human tasks in different areas. However, training such agents is difficult and restricted to expert users. Moreover, it is mostly limited to simulation environments due to the high cost and safety concerns of interactions in the real world. Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations. It is a relatively recent area in machine learning, but it is gaining significant traction due to having tremendous potential for learning complex behaviors from demonstrations. Learning from demonstration accelerates the learning process by improving sample efficiency, while also reducing the effort of the programmer. Due to learning without interacting with the environment, demonstration learning would allow the automation of a wide range of real world applications such as robotics and healthcare. This paper provides a survey of demonstration learning, where we formally introduce the demonstration problem along with its main challenges and provide a comprehensive overview of the process of learning from demonstrations from the creation of the demonstration data set, to learning methods from demonstrations, and optimization by combining demonstration learning with different machine learning methods. We also review the existing benchmarks and identify their strengths and limitations. Additionally, we discuss the advantages and disadvantages of the paradigm as well as its main applications. Lastly, we discuss our perspective on open problems and research directions for this rapidly growing field.
翻译:随着机器学习的快速发展,强化学习已被用于自动化不同领域的人类任务。然而,训练此类智能体较为困难且仅限于专家用户。此外,由于现实世界交互的高成本和安全问题,其应用大多局限于仿真环境。演示学习是一种范式,智能体通过模仿演示中专家展示的行为来学习执行任务。这是机器学习中一个相对较新的领域,但由于其在从演示中学习复杂行为方面具有巨大潜力而日益受到重视。从演示中学习通过提高样本效率加速学习过程,同时减少程序员的编程工作量。由于无需与环境交互即可学习,演示学习能够实现机器人技术和医疗保健等广泛现实世界应用的自动化。本文对演示学习进行了综述,首先正式介绍了演示问题及其主要挑战,并全面概述了从演示数据集创建到演示学习方法,再到通过将演示学习与不同机器学习方法相结合进行优化的整个学习流程。我们还回顾了现有基准测试,指出了它们的优势与局限性。此外,我们讨论了该范式的优缺点及其主要应用。最后,我们针对这一快速发展的领域,阐述了关于开放性问题与研究方向的看法。