We present an algorithm for skill discovery from expert demonstrations. The algorithm first uses Large Language Models (LLMs) to propose an initial segmentation of the trajectories. A hierarchical variational inference framework then incorporates the LLM-generated segmentation to discover reusable skills by merging trajectory segments. To control the trade-off between compression and reusability, we introduce a novel auxiliary objective, based on the Minimum Description Length principle, that guides this skill discovery process. Our results demonstrate that agents equipped with our method discover skills that accelerate learning and outperform baseline skill learning approaches on new long-horizon tasks in BabyAI, a grid world navigation environment, and in ALFRED, a household simulation environment.
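To give some intuition for the compression versus reusability trade-off, the following is a minimal toy sketch, not the method described above: it swaps the hierarchical variational inference framework for a simple greedy merge and scores candidates with a two-part description length. The segment labels, the `alpha` weight, the cost model (`alpha` bits per primitive label stored in the skill library, plus the empirical entropy code length of the skill sequence), and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def description_length(trajs, alpha=1.0):
    """Two-part code length in bits (illustrative cost model, an assumption).

    Library cost: alpha bits per primitive label stored in each distinct skill.
    Data cost: empirical entropy code length of the skill sequence.
    """
    library = {skill for traj in trajs for skill in traj}
    library_bits = alpha * sum(len(skill) for skill in library)
    counts = Counter(skill for traj in trajs for skill in traj)
    total = sum(counts.values())
    data_bits = -sum(c * math.log2(c / total) for c in counts.values())
    return library_bits + data_bits


def merge_pair(traj, a, b):
    """Replace each adjacent occurrence of skills (a, b) with the merged skill a + b."""
    out, i = [], 0
    while i < len(traj):
        if i + 1 < len(traj) and traj[i] == a and traj[i + 1] == b:
            out.append(a + b)  # tuples concatenate into a longer skill
            i += 2
        else:
            out.append(traj[i])
            i += 1
    return out


def greedy_mdl_merge(segmented, alpha=1.0):
    """Greedily merge adjacent skills while the description length keeps dropping.

    `segmented` stands in for an initial segmentation (e.g. proposed by an LLM):
    one list of segment labels per demonstration.
    """
    trajs = [[(label,) for label in traj] for traj in segmented]
    best = description_length(trajs, alpha)
    while True:
        pairs = {(t[i], t[i + 1]) for t in trajs for i in range(len(t) - 1)}
        candidates = []
        for a, b in pairs:
            merged = [merge_pair(t, a, b) for t in trajs]
            candidates.append((description_length(merged, alpha), merged))
        if not candidates:
            return trajs, best
        cand_bits, cand_trajs = min(candidates, key=lambda c: c[0])
        if cand_bits >= best:  # no merge shortens the code: stop
            return trajs, best
        trajs, best = cand_trajs, cand_bits


# Hypothetical segment labels, loosely in the style of a grid world task.
demos = [
    ["goto_key", "pickup_key", "goto_door", "open_door"],
    ["goto_key", "pickup_key", "goto_door", "open_door", "goto_ball", "pickup_ball"],
    ["goto_ball", "pickup_ball", "goto_key", "pickup_key"],
]
skills, bits = greedy_mdl_merge(demos, alpha=1.0)
print(f"description length: {bits:.2f} bits")
for traj in skills:
    print(traj)
```

In this toy setting, a larger `alpha` makes every stored skill more expensive, which discourages long, rarely reused skills, while a smaller `alpha` favors aggressive merging. That is only the qualitative behavior the Minimum Description Length objective is meant to control; the actual objective and inference procedure are not specified in this abstract.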