In this work, we address the problem of transferring an autonomous driving (AD) module from one domain to another, in particular from simulation to the real world (Sim2Real). We propose a data-efficient method for online, on-the-fly, learning-based adaptation of parametrizable control architectures, such that the target closed-loop performance is optimized under several sources of uncertainty, such as model mismatch, environment changes, and task choice. The novelty of the work lies in leveraging black-box optimization enabled by executable digital twins, combined with data-driven hyperparameter tuning through derivative-free methods, to adapt the AD module directly in real time. The proposed method requires minimal interaction with the real world during the randomization and online training phases. Specifically, we validate our approach in real-world experiments and show that a nonlinear model predictive controller (NMPC) can be transferred and safely tuned in less than 10 minutes, eliminating the need for day-long manual tuning and hours-long machine learning training phases. Our results show that the online-adapted NMPC directly compensates for disturbances, avoids overtuning to the simulation or to one specific task, generalizes across a multitude of trajectories with a tracking error below 15 cm, and yields an 83% improvement in tracking.
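To illustrate the general idea of derivative-free, black-box tuning of controller parameters against a closed-loop cost, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the toy 1-D point-mass plant, the PD controller standing in for the NMPC, the simple random-search optimizer, the names `closed_loop_cost` and `tune`, and the use of a changed `mass` to mimic a mismatch between the digital twin and the real system.

```python
import random


def closed_loop_cost(params, mass=1.0, steps=200, dt=0.05):
    """Mean squared tracking error of a toy point mass following a unit step.

    Hypothetical stand-in for a closed-loop rollout: a PD controller with
    gains (kp, kd) replaces the NMPC, and `mass` models plant mismatch.
    """
    kp, kd = params
    pos, vel, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        err = 1.0 - pos                  # tracking error w.r.t. the reference
        force = kp * err - kd * vel      # PD control law
        vel += (force / mass) * dt       # semi-implicit Euler integration
        pos += vel * dt
        cost += err * err
    return cost / steps


def tune(cost_fn, init, iters=200, sigma=0.5, seed=0):
    """Derivative-free random search: keep a perturbation only if it lowers the cost."""
    rng = random.Random(seed)
    best, best_cost = list(init), cost_fn(init)
    for _ in range(iters):
        # Gaussian perturbation of the incumbent, clipped to non-negative gains.
        cand = [max(0.0, p + rng.gauss(0.0, sigma)) for p in best]
        c = cost_fn(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost


# Stage 1: many cheap evaluations on the (assumed) digital twin.
init = [1.0, 0.5]
sim_params, sim_cost = tune(closed_loop_cost, init)


# Stage 2: a small evaluation budget on the mismatched "real" plant,
# warm-started from the simulation result.
def real_cost_fn(p):
    return closed_loop_cost(p, mass=2.0)


real_params, real_cost = tune(real_cost_fn, sim_params, iters=20, sigma=0.2, seed=1)
```

The two-stage structure reflects the abstract's emphasis on limiting real-world interaction: most of the search budget is spent in simulation, and only a short, warm-started refinement runs on the mismatched plant. A practical setup would also need safety constraints on candidate parameters, which this sketch omits.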