Embodied agents struggle to generalize to new environments, even when those environments share underlying structure with their training settings. Most current approaches to generating these training environments follow an open-loop paradigm that ignores the agent's current performance. While procedural generation methods can produce diverse scenes, diversity without feedback from the agent is inefficient: the generated environments may be trivially easy, providing limited learning signal. To address this, we present a proof-of-concept for closed-loop environment generation that adapts difficulty to the agent's current capabilities. Our system employs a controllable environment representation, extracts fine-grained performance feedback beyond binary success or failure, and implements a closed-loop adaptation mechanism that translates this feedback into environment modifications. This feedback-driven approach generates training environments that are more challenging precisely in the ways the agent needs to improve, enabling more efficient learning and better generalization to novel settings.
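The closed-loop adaptation described above can be sketched minimally. The code below is a hypothetical illustration, not the paper's implementation: it assumes the controllable environment representation is a dictionary of per-skill difficulty parameters, and that fine-grained feedback takes the form of per-skill success rates. The `evaluate` function is a stand-in for actually rolling out the agent.

```python
import random

def evaluate(agent, env_params, episodes=20):
    """Stand-in for agent rollouts: return per-skill success rates
    (the fine-grained feedback, instead of a single pass/fail bit).
    Here success probability simply decays with the difficulty parameter."""
    rates = {}
    for skill, difficulty in env_params.items():
        successes = sum(random.random() < max(0.0, 1.0 - 0.1 * difficulty)
                        for _ in range(episodes))
        rates[skill] = successes / episodes
    return rates

def adapt(env_params, rates, target=0.6, step=1):
    """Closed-loop update: raise difficulty on skills the agent already
    masters (environment is trivially easy there), ease it where the
    agent fails far below the target rate, and hold steady otherwise."""
    new = {}
    for skill, d in env_params.items():
        if rates[skill] > target:
            new[skill] = d + step              # too easy -> harder
        elif rates[skill] < target - 0.3:
            new[skill] = max(0, d - step)      # too hard -> easier
        else:
            new[skill] = d                     # in the useful range
    return new

# Feedback-driven generation loop (agent unused by the stand-in evaluate)
params = {"navigation": 1, "manipulation": 1}
for _ in range(5):
    feedback = evaluate(agent=None, env_params=params)
    params = adapt(params, feedback)
```

The key contrast with open-loop procedural generation is the `adapt` step: each new environment configuration depends on measured performance, so difficulty tracks the agent's capability frontier rather than being sampled blindly.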