Recent work in reinforcement learning has shown that incorporating structural priors for articulated robots, such as link connectivity, into policy networks improves learning efficiency. However, dynamics properties, despite their fundamental role in determining how forces and motion propagate through the body, remain largely unexplored as an inductive bias for policy learning. To address this gap, we present the Articulated-Body Dynamics Network (ABD-Net), a novel graph neural network architecture grounded in the computational structure of forward dynamics. Specifically, we adapt the inertia propagation mechanism of the Articulated Body Algorithm, which aggregates inertial quantities from child links to parent links along the kinematic tree, and replace the physical quantities with learnable parameters. Embedding ABD-Net into the policy actor yields dynamics-informed representations that capture how actions propagate through the body, leading to efficient and robust policy learning. In experiments with simulated humanoid, quadruped, and hopper robots, our approach achieves higher sample efficiency and better generalization to dynamics shifts than transformer-based and GNN baselines. We further validate the learned policies on real Unitree G1 and Go2 robots, state-of-the-art humanoid and quadruped platforms, generating dynamic, versatile, and robust locomotion behaviors through sim-to-real transfer with real-time inference.
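The child-to-parent aggregation described above can be pictured with a minimal NumPy sketch. It is an illustration under stated assumptions, not the ABD-Net implementation: the function name `aggregate_child_to_parent`, the weight matrices `W_self` and `W_child`, the `tanh` nonlinearity, and the parent-array link ordering are all hypothetical choices made here to mirror the leaf-to-root pass of the Articulated Body Algorithm, with learnable matrices standing in for the physical inertia terms.

```python
import numpy as np

def aggregate_child_to_parent(parents, node_feats, W_self, W_child):
    """Leaf-to-root pass over a kinematic tree (illustrative only).

    parents[i] is the index of link i's parent, with -1 for the root.
    Links are assumed to be numbered so that parents[i] < i, the usual
    convention for recursions over articulated bodies.
    """
    n = len(parents)
    # Each link starts from an embedding of its own features.
    h = np.tanh(node_feats @ W_self)
    # Visit links from the highest index down to 1, so every child is
    # complete before its contribution is added to its parent.
    for i in range(n - 1, 0, -1):
        h[parents[i]] += np.tanh(h[i] @ W_child)
    return h
```

In this sketch a leaf keeps only its own embedding, while the root accumulates information from every descendant, which is the same direction in which articulated-body inertias are accumulated in the original algorithm.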