This study presents EmoBipedNav, an emotion-aware navigation framework that uses deep reinforcement learning (DRL) for bipedal robots walking in socially interactive environments. The inherent locomotion constraints of bipedal robots challenge their ability to maneuver safely in dynamic environments. When combined with the intricacies of social environments, including pedestrian interactions and social cues such as emotions, these challenges become even more pronounced. To address these coupled problems, we propose a two-stage pipeline that considers both bipedal locomotion constraints and complex social environments. Specifically, social navigation scenarios are represented using sequential LiDAR grid maps (LGMs), from which we extract latent features, including collision regions, emotion-related discomfort zones, social interactions, and the spatio-temporal dynamics of evolving environments. The extracted features are mapped directly to the actions of reduced-order models (ROMs) through a DRL architecture. Furthermore, the proposed framework incorporates full-order dynamics and locomotion constraints during training, effectively accounting for tracking errors and restrictions of the locomotion controller while planning the trajectory with ROMs. Comprehensive experiments demonstrate that our approach outperforms both model-based planners and DRL-based baselines. The hardware videos and open-source code are available at https://gatech-lidar.github.io/emobipednav.github.io/.
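The pipeline above can be illustrated with a minimal sketch: sequential LiDAR grid maps are stacked into an observation, encoded into latent features, and mapped to a reduced-order-model action. All names, shapes, and the random linear "encoder" and "policy" below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def stack_lgms(scans, grid_size=64, history=4):
    """Stack the most recent `history` LiDAR grid maps into one observation."""
    frames = scans[-history:]
    assert all(f.shape == (grid_size, grid_size) for f in frames)
    return np.stack(frames, axis=0)  # shape: (history, grid_size, grid_size)

def extract_features(obs, feature_dim=32, rng=None):
    """Placeholder latent-feature extractor (stands in for the learned encoder)."""
    rng = rng or np.random.default_rng(0)
    w = rng.standard_normal((obs.size, feature_dim)) / np.sqrt(obs.size)
    return np.tanh(obs.ravel() @ w)

def rom_action(features, action_dim=2, rng=None):
    """Map latent features to a ROM action (e.g., planar velocity command)."""
    rng = rng or np.random.default_rng(1)
    w = rng.standard_normal((features.size, action_dim)) / np.sqrt(features.size)
    return np.tanh(features @ w)

# Example: five empty 64x64 grid maps -> stacked observation -> action.
scans = [np.zeros((64, 64)) for _ in range(5)]
obs = stack_lgms(scans)
act = rom_action(extract_features(obs))
print(obs.shape, act.shape)  # (4, 64, 64) (2,)
```

In the actual framework, the encoder and policy would be trained end-to-end with DRL, and the ROM action would then be tracked by a full-order locomotion controller whose tracking errors are accounted for during training.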