Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, are considered a crucial foundation for general AI agents. For planning and control, although traditional models and task-specific methods have been extensively studied over the past few decades, they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular today, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or the underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervised neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.