Skateboards are a compact and efficient personal mobility device. However, controlling them with legged robots poses several challenges for policy learning: the interaction with the board is perception-driven, and the control objectives are multi-modal, changing across distinct skateboarding phases. To address these challenges, we introduce Phase-Aware Policy Learning (PAPL), a reinforcement-learning framework tailored to skateboarding with quadruped robots. PAPL exploits the cyclic nature of skateboarding by integrating phase-conditioned Feature-wise Linear Modulation (FiLM) layers into the actor and critic networks, yielding a unified policy that captures phase-dependent behaviors while sharing robot-specific knowledge across phases. In simulation, we validate command-tracking accuracy and run ablation studies that quantify each component's contribution. We also compare locomotion efficiency against legged and wheel-legged baselines and demonstrate real-world transferability.
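To make the phase-conditioning idea concrete, the following is a minimal sketch of a single FiLM-modulated dense layer, not the authors' implementation. It assumes the skateboarding phase is a scalar in [0, 1) encoded as a (sin, cos) pair, and that the scale and shift are affine functions of that encoding; the names `phase_encoding`, `FiLMLayer`, the layer sizes, and the weight initialization are all hypothetical and chosen only for illustration.

```python
import math
import random


def phase_encoding(phase):
    """Map a cyclic phase in [0, 1) to a point on the unit circle (assumed encoding)."""
    angle = 2.0 * math.pi * phase
    return [math.sin(angle), math.cos(angle)]


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class FiLMLayer:
    """Dense layer whose features are scaled and shifted as a function of the phase."""

    def __init__(self, in_dim, out_dim, cond_dim=2, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Weights shared across all phases (the "robot-specific knowledge").
        self.w = init(out_dim, in_dim)
        self.b = [0.0] * out_dim
        # Phase-dependent per-feature scale (gamma) and shift (beta).
        self.w_gamma = init(out_dim, cond_dim)
        self.b_gamma = [1.0] * out_dim  # start close to identity scaling
        self.w_beta = init(out_dim, cond_dim)
        self.b_beta = [0.0] * out_dim

    def forward(self, x, phase):
        cond = phase_encoding(phase)
        h = linear(self.w, self.b, x)
        gamma = linear(self.w_gamma, self.b_gamma, cond)
        beta = linear(self.w_beta, self.b_beta, cond)
        # FiLM: feature-wise affine modulation, followed by a nonlinearity.
        return [math.tanh(g * hi + bt) for g, hi, bt in zip(gamma, h, beta)]


# Illustrative usage with a made-up 4-dimensional observation.
layer = FiLMLayer(in_dim=4, out_dim=3)
obs = [0.5, -0.2, 0.1, 0.9]
out_a = layer.forward(obs, phase=0.0)
out_b = layer.forward(obs, phase=0.5)
out_wrapped = layer.forward(obs, phase=1.0)
```

In this sketch the same observation produces different features at different phases, while the (sin, cos) encoding keeps the output continuous across the wrap from phase 1.0 back to 0.0. Stacking such layers in both the actor and the critic would be one way to obtain a single network with phase-dependent behavior.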