We propose a hierarchical entity-centric framework for offline Goal-Conditioned Reinforcement Learning (GCRL) that combines subgoal decomposition with factored structure to solve long-horizon tasks in domains with multiple entities. Achieving long-horizon goals in complex environments remains a core challenge in Reinforcement Learning (RL). Domains with multiple entities are particularly difficult due to their combinatorial complexity. GCRL facilitates generalization across goals and the use of subgoal structure, but struggles with high-dimensional observations and combinatorial state spaces, especially under sparse rewards. We employ a two-level hierarchy composed of a value-based GCRL agent and a factored subgoal-generating conditional diffusion model. The RL agent and subgoal generator are trained independently and composed post hoc through selective subgoal generation based on the value function, making the approach modular and compatible with existing GCRL algorithms. We introduce new variations of benchmark tasks that highlight the challenges of multi-entity domains, and show that our method consistently boosts the performance of the underlying RL agent on image-based long-horizon tasks with sparse rewards, achieving over 150% higher success rates on the hardest task in our suite and generalizing to longer horizons and larger numbers of entities. Rollout videos are provided at: https://sites.google.com/view/hecrl
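The "selective subgoal generation based on the value function" described above can be illustrated with a minimal sketch (not the authors' code): a goal-conditioned value function judges whether the final goal is directly reachable, and only when it is not does a generative subgoal model propose an intermediate target. All names here (`toy_value`, `toy_subgoal`, `select_target`, `threshold`) are illustrative assumptions, shown on a toy 1-D state space in place of the image-based states and diffusion model used in the paper.

```python
# Toy 1-D illustration: states and goals are floats. The "value" is higher
# when the goal is nearby, and the toy "subgoal model" proposes the midpoint
# between the current state and the goal.
def toy_value(state, goal):
    """Stand-in for a learned goal-conditioned value function."""
    return 1.0 / (1.0 + abs(goal - state))

def toy_subgoal(state, goal):
    """Stand-in for a learned (e.g. diffusion-based) subgoal generator."""
    return (state + goal) / 2.0

def select_target(state, goal, value_fn, subgoal_model, threshold=0.5):
    """Pursue the final goal if it looks reachable; otherwise query a subgoal."""
    if value_fn(state, goal) >= threshold:
        return goal  # goal judged reachable: act toward it directly
    return subgoal_model(state, goal)  # otherwise pursue an intermediate subgoal

# Nearby goal is pursued directly; a distant goal is replaced by a subgoal.
print(select_target(0.0, 0.5, toy_value, toy_subgoal))  # -> 0.5
print(select_target(0.0, 4.0, toy_value, toy_subgoal))  # -> 2.0
```

Because the value function and subgoal generator interact only through this selection rule, either component can be swapped out independently, which is what makes the composition modular.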