Control tuning and adaptation present a significant challenge to the use of robots in diverse environments. It is often nontrivial to find by hand a single set of control parameters that works well across the broad array of environments and conditions a robot might encounter. Automated adaptation approaches must exploit prior knowledge about the system while still adapting to significant domain shifts, so that new control parameters can be found quickly. In this work, we present a general framework for online controller adaptation that addresses these challenges. We combine meta-learning with Bayesian recursive estimation to learn prior predictive models of system performance that adapt quickly to online data, even under significant domain shift. These predictive models serve as cost functions within efficient sampling-based optimization routines that find new control parameters online to maximize system performance. Our framework is powerful and flexible enough to adapt controllers for four diverse systems: a simulated race car, a simulated quadrupedal robot, and both a simulated and a physical quadrotor. The video and code can be found at https://hersh500.github.io/occam.
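To make the pipeline described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar control parameter, a fixed quadratic basis standing in for a meta-learned feature network, and a Gaussian prior over linear weights standing in for the meta-learned prior. The names `features`, `RecursiveBayesLinear`, `best_parameter`, and `true_performance` are hypothetical and chosen only for illustration. The model is updated recursively from one performance measurement at a time, and plain random sampling is used as the sampling-based optimizer.

```python
import random


def features(theta):
    """Hypothetical fixed basis; a stand-in for meta-learned features."""
    return [1.0, theta, theta * theta]


class RecursiveBayesLinear:
    """Gaussian belief over weights w in y = w . features(theta) + noise."""

    def __init__(self, prior_mean, prior_var, noise_var):
        n = len(prior_mean)
        self.mean = list(prior_mean)
        # Isotropic prior covariance: prior_var on the diagonal.
        self.cov = [[prior_var if i == j else 0.0 for j in range(n)]
                    for i in range(n)]
        self.noise_var = noise_var

    def predict(self, theta):
        """Posterior-mean prediction of performance at theta."""
        return sum(m * p for m, p in zip(self.mean, features(theta)))

    def update(self, theta, y):
        """One recursive (Kalman-style) update from a single observation."""
        phi = features(theta)
        n = len(phi)
        cov_phi = [sum(self.cov[i][j] * phi[j] for j in range(n))
                   for i in range(n)]
        innovation_var = sum(p * c for p, c in zip(phi, cov_phi)) + self.noise_var
        gain = [c / innovation_var for c in cov_phi]
        error = y - self.predict(theta)
        self.mean = [m + g * error for m, g in zip(self.mean, gain)]
        # cov <- cov - gain * (cov @ phi)^T, valid because cov is symmetric.
        self.cov = [[self.cov[i][j] - gain[i] * cov_phi[j] for j in range(n)]
                    for i in range(n)]


def best_parameter(model, low, high, n_samples, rng):
    """Random-sampling search for the highest predicted performance."""
    candidates = [rng.uniform(low, high) for _ in range(n_samples)]
    return max(candidates, key=model.predict)


def true_performance(theta):
    """Toy 'shifted' environment whose best parameter is theta = 0.7."""
    return -(theta - 0.7) ** 2


rng = random.Random(0)
# Assumed prior: performance peaks at theta = 0.2, i.e. -(theta - 0.2)^2.
model = RecursiveBayesLinear(prior_mean=[-0.04, 0.4, -1.0],
                             prior_var=1.0, noise_var=1e-4)
theta = best_parameter(model, 0.0, 1.0, 200, rng)
history = [theta]
for _ in range(10):
    # Run with the current parameter, measure performance, update, re-optimize.
    model.update(theta, true_performance(theta))
    theta = best_parameter(model, 0.0, 1.0, 200, rng)
    history.append(theta)
```

In this toy setup the true performance lies in the span of the assumed features, so `history` can be inspected to see how the chosen parameter moves away from the prior's optimum as measurements arrive. The loop is purely greedy; a real system would likely need richer features, a vector-valued parameter, and some treatment of exploration, none of which this sketch attempts.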