Unlike many heavy and rigid robots, Buoyancy Assisted Lightweight Legged Unit (BALLU) robots are light and soft, which gives them great potential for intrinsically safe interaction in environments involving humans. However, their unique and sensitive dynamics make it challenging to obtain robust control policies in the real world. In this work, we demonstrate robust sim-to-real transfer of control policies on BALLU robots via system identification and our novel residual physics learning method, Environment Mimic (EnvMimic). First, we model the nonlinear dynamics of the actuators by collecting hardware data and optimizing the simulation parameters. Then, rather than relying on standard supervised learning formulations, we use deep reinforcement learning to train an external force policy so that simulated trajectories match real-world ones, which enables us to model residual physics with greater fidelity. We assess the improvement in simulation fidelity by comparing simulated trajectories against real-world ones. Finally, we demonstrate that the improved simulator allows us to learn better walking and turning policies that can be successfully deployed on BALLU hardware.