Advancements in generative AI have broadened the potential applications of Large Language Models (LLMs) in the development of autonomous agents. Achieving true autonomy requires accumulating and updating knowledge gained from interactions with the environment, and using that knowledge effectively. Current LLM-based approaches draw on past experience through a full history of observations, summarization, or retrieval augmentation. However, these unstructured memory representations do not facilitate the reasoning and planning essential for complex decision-making. In this study, we introduce AriGraph, a novel method in which the agent constructs a memory graph that integrates semantic and episodic memories while exploring the environment. This graph structure facilitates efficient associative retrieval of interconnected concepts relevant to the agent's current state and goals, and thus serves as an effective environmental model that enhances the agent's exploratory and planning capabilities. We demonstrate that our Ariadne LLM agent, equipped with the proposed memory architecture and augmented with planning and decision-making, effectively handles complex tasks in a zero-shot setting in the TextWorld environment. Our approach markedly outperforms established methods such as full-history, summarization, and Retrieval-Augmented Generation on a variety of tasks, including the cooking challenge from the First TextWorld Problems competition and novel tasks such as house cleaning and a Treasure Hunting puzzle.
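To make the idea of a memory graph that combines semantic and episodic memory more concrete, the following is a minimal sketch and not the AriGraph implementation. It assumes that semantic memory is stored as (subject, relation, object) triples, that each episode records the raw observation together with the triples extracted from it, and that associative retrieval is a bounded breadth-first expansion from query entities followed by ranking episodes by triple overlap. The class `MemoryGraph`, its method names, and the example triples are all hypothetical; in a real agent the triples would be extracted from observations by an LLM.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, relation, object), an assumed format


@dataclass
class MemoryGraph:
    """Hypothetical memory graph: semantic triples plus episodic records."""
    # Semantic memory: for each entity, the triples it takes part in.
    neighbors: dict = field(default_factory=lambda: defaultdict(set))
    # Episodic memory: (step, observation text, triples seen at that step).
    episodes: list = field(default_factory=list)

    def add_observation(self, step: int, text: str, triples: list[Triple]) -> None:
        """Store one observation and link it to the triples extracted from it."""
        for triple in triples:
            subject, _, obj = triple
            self.neighbors[subject].add(triple)
            self.neighbors[obj].add(triple)
        self.episodes.append((step, text, frozenset(triples)))

    def semantic_search(self, start_entities: list[str], depth: int = 1) -> set:
        """Collect triples reachable within `depth` hops of the start entities."""
        found: set = set()
        seen = set(start_entities)
        frontier = deque((entity, 0) for entity in start_entities)
        while frontier:
            entity, dist = frontier.popleft()
            if dist >= depth:
                continue
            for triple in self.neighbors.get(entity, ()):
                found.add(triple)
                for nxt in (triple[0], triple[2]):
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, dist + 1))
        return found

    def episodic_search(self, relevant: set, k: int = 1) -> list:
        """Return up to k episodes sharing the most triples with `relevant`."""
        scored = [
            (len(ep_triples & relevant), step, text)
            for step, text, ep_triples in self.episodes
        ]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(step, text) for _, step, text in scored[:k]]


# Usage with made-up TextWorld-style observations.
memory = MemoryGraph()
memory.add_observation(
    1,
    "You are in the kitchen. A knife lies on the table.",
    [("knife", "is on", "table"), ("table", "is in", "kitchen")],
)
memory.add_observation(
    2,
    "You walk west into the living room.",
    [("living room", "is west of", "kitchen")],
)

facts = memory.semantic_search(["knife"], depth=2)
recalled = memory.episodic_search(facts, k=1)
```

In this sketch, a query about the knife first pulls in nearby semantic facts (the knife is on the table, the table is in the kitchen) and then recalls the episode in which those facts were observed, which illustrates how a structured memory can connect the current goal to both general knowledge and specific past experience.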