In Programming by Demonstration, a robot learns novel skills from human demonstrations. After learning, the robot should be able not only to reproduce the skill, but also to generalize it to shifted domains without collecting new training data. Adaptation to similar domains has been investigated in the literature; however, it remains an open problem how to adapt learned skills to conditions that lie outside the data distribution and, more importantly, how to preserve the precision of the desired adaptations. This paper presents a novel supervised learning framework called Constrained Equation Learner Networks that addresses the trajectory adaptation problem in Programming by Demonstration from a constrained regression perspective. While conventional approaches to constrained regression use a single kind of basis function, e.g., Gaussian, we exploit Equation Learner Networks to learn a set of analytical expressions and use them as basis functions. These basis functions are learned from demonstrations with the objective of minimizing deviations from the training data while imposing constraints that represent the desired adaptations, such as new initial or final points or keeping the trajectory within given bounds. Our approach addresses three main difficulties in adapting robotic trajectories: 1) minimizing the distortion of the trajectory under new adaptations; 2) preserving the precision of the adaptations; and 3) dealing with the lack of intuition about the structure of the basis functions. We validate our approach both in simulation and in real experiments on a set of robotic tasks that require adaptation due to changes in the environment, and we compare the results with two existing approaches. The experiments show that Constrained Equation Learner Networks outperform state-of-the-art approaches by increasing the generalization and adaptability of robotic skills.
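To make the constrained regression perspective concrete, the following is a minimal sketch of equality-constrained least squares over a fixed set of analytic basis functions. It is not the paper's method: the basis here is hand-picked rather than learned by an Equation Learner Network, and the names `basis` and `fit_constrained`, the `ridge` parameter, and the toy demonstration are all assumptions made for illustration. The weights are obtained by solving the KKT system of the problem "minimize the squared deviation from the demonstration subject to the trajectory passing through the new initial and final points".

```python
import numpy as np

def basis(t):
    """Hypothetical hand-picked analytic basis (stand-in for learned expressions)."""
    t = np.asarray(t, dtype=float)
    return np.stack(
        [np.ones_like(t), t, t**2, np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)],
        axis=-1,
    )

def fit_constrained(t, y, t_c, y_c, ridge=1e-8):
    """Minimize ||Phi w - y||^2 subject to basis(t_c) @ w = y_c via the KKT system."""
    Phi = basis(t)          # (samples, n_basis) design matrix
    C = basis(t_c)          # (n_constraints, n_basis) constraint matrix
    n, m = Phi.shape[1], C.shape[0]
    # Small ridge term keeps the Hessian positive definite (an assumption, not from the paper).
    H = 2.0 * (Phi.T @ Phi + ridge * np.eye(n))
    K = np.block([[H, C.T], [C, np.zeros((m, m))]])
    rhs = np.concatenate([2.0 * Phi.T @ y, y_c])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]          # drop the Lagrange multipliers

# Toy demonstration that starts and ends at 0.
t = np.linspace(0.0, 1.0, 50)
y_demo = np.sin(np.pi * t)

# Desired adaptation: new initial point 0.2 and new final point -0.1.
w = fit_constrained(t, y_demo, t_c=np.array([0.0, 1.0]), y_c=np.array([0.2, -0.1]))
y_new = basis(t) @ w
```

In this sketch the equality constraints are met to numerical precision, while the remaining degrees of freedom keep the adapted trajectory as close as possible to the demonstration. Bound constraints, which the abstract also mentions, would require an inequality-constrained solver and are not covered here.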