To succeed in the real world, robots must cope with situations that differ from those seen during training. We study the problem of adapting on the fly to such novel scenarios during deployment by drawing upon a diverse repertoire of previously learned behaviors. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors to select and adapt those behaviors to the situation at hand. Crucially, this adaptation process happens entirely within a single episode at test time, without any human supervision. We demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet. When facing a variety of out-of-distribution situations during deployment, our approach adapts over 2x as efficiently as existing methods by effectively choosing and adapting relevant behaviors on the fly.
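As a rough illustration of selecting among pre-trained behaviors by their perceived value, the following is a minimal sketch and not the authors' implementation. It assumes each behavior exposes a scalar value estimate for the current state and that a behavior is sampled with probability proportional to the exponentiated value. The function name `select_behavior`, the `temperature` parameter, the softmax sampling rule, and the example numbers are all assumptions made for illustration.

```python
import numpy as np


def select_behavior(values, temperature=1.0, rng=None):
    """Sample a behavior index from a softmax over per-behavior value estimates.

    values: one scalar value estimate per pre-trained behavior (hypothetical input).
    temperature: assumed knob; lower values make the choice closer to greedy.
    Returns the sampled index and the selection probabilities.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = (values - values.max()) / temperature
    probs = np.exp(logits)
    probs /= probs.sum()
    idx = int(rng.choice(len(values), p=probs))
    return idx, probs


# Hypothetical value estimates of three behaviors at the current state.
example_values = [0.2, 1.5, 0.1]
idx, probs = select_behavior(example_values, temperature=0.1,
                             rng=np.random.default_rng(0))
print(idx, probs)
```

In such a scheme the selection would be repeated as the state changes within the episode, so the active behavior can switch whenever another behavior's value estimate becomes higher; how the selected behavior is then fine-tuned is outside the scope of this sketch.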