We study collision-free humanoid traversal in cluttered indoor scenes, which requires skills such as hurdling over objects scattered on the floor, crouching under low-hanging obstacles, and squeezing through narrow passages. To do so, the humanoid must map its perception of surrounding obstacles, which vary in spatial layout and geometry, to the corresponding traversal skills. However, directly learning such mappings is difficult because no effective representation captures humanoid-obstacle relationships during collision avoidance. We therefore propose the Humanoid Potential Field (HumanoidPF), which encodes these relationships as collision-free motion directions and significantly facilitates RL-based traversal skill learning. We also find that HumanoidPF exhibits a surprisingly negligible sim-to-real gap as a perceptual representation. To make the traversal skills generalize across diverse and challenging cluttered indoor scenes, we further propose a hybrid scene generation method that combines crops of realistic 3D indoor scenes with procedurally synthesized obstacles. We successfully transfer our policy to the real world and develop a teleoperation system in which users can command the humanoid to traverse cluttered indoor scenes with a single click. Extensive experiments in both simulation and the real world validate the effectiveness of our method. Demos and code are available on our website: https://axian12138.github.io/CAT/.
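To make the idea of encoding obstacle relationships as motion directions concrete, the sketch below implements a classical artificial potential field for a 2D point robot. It is not the HumanoidPF formulation, which the abstract does not specify. The function name `apf_direction` and the parameters `k_att`, `k_rep`, and `d0` are hypothetical names chosen for illustration only.

```python
import math


def apf_direction(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=1.0):
    """Unit 2D motion direction from a classical attractive/repulsive field.

    Illustrative assumption only: a point robot and point obstacles,
    not the humanoid body or the HumanoidPF representation.
    """
    # Attractive force: negative gradient of 0.5 * k_att * ||pos - goal||^2.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])

    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Obstacles only act within the influence radius d0.
        if 0.0 < d < d0:
            # Repulsive force: negative gradient of
            # 0.5 * k_rep * (1/d - 1/d0)^2, pointing away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d

    norm = math.hypot(fx, fy)
    if norm == 0.0:
        # Forces cancel exactly (a local minimum); no direction is defined.
        return (0.0, 0.0)
    return (fx / norm, fy / norm)


if __name__ == "__main__":
    # With no obstacles the direction points straight at the goal.
    print(apf_direction((0.0, 0.0), (2.0, 0.0), []))
    # An obstacle slightly below the path deflects the direction upward.
    print(apf_direction((0.0, 0.0), (2.0, 0.0), [(0.5, -0.1)]))
```

In this toy setting the returned direction could be fed to a controller as a compact observation in place of raw obstacle geometry, which is the general intuition behind using a field of motion directions as a perceptual representation.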