In the robot follow-ahead task, a mobile robot must maintain its relative position in front of a moving human actor while keeping the actor in sight. To accomplish this task, it is important that the robot understand the full 3D pose of the human (since the head orientation can differ from that of the torso) and predict future human poses so that it can plan accordingly. This prediction task is especially difficult in complex environments with junctions and multiple corridors. In this work, we address the problem of forecasting the full 3D trajectory of a human in such environments. Our main insight is that one can first predict the 2D trajectory and then estimate the full 3D trajectory by conditioning the estimator on the predicted 2D trajectory. With this approach, we achieve results comparable to or better than state-of-the-art methods while running three times faster. As part of our contribution, we present a new dataset in which, in contrast to existing datasets, the human motion covers an area much larger than a single room. We also present a complete robot system that integrates our human pose forecasting network on a mobile robot to enable real-time robot follow-ahead, and we report results from real-world experiments in multiple buildings on campus. Our project page, including supplementary material and videos, can be found at: https://qingyuan-jiang.github.io/iros2024_poseForecasting/