LLM-powered embodied agents have succeeded on conventional object-rearrangement tasks, but providing personalized assistance that leverages user-specific knowledge from past interactions presents new challenges. We investigate these challenges through the lens of agents' memory utilization along two critical dimensions: object semantics (identifying objects based on personal meaning) and user patterns (recalling sequences from behavioral routines). To assess these capabilities, we construct MEMENTO, an end-to-end two-stage evaluation framework comprising single-memory and joint-memory tasks. Our experiments reveal that current agents can recall simple object semantics but struggle to apply sequential user patterns to planning. Through in-depth analysis, we identify two critical bottlenecks: information overload and coordination failures when handling multiple memories. Based on these findings, we explore memory architectures that address these challenges. Having observed that episodic memory provides both personalized knowledge and in-context learning benefits, we design a hierarchical knowledge-graph-based user-profile memory module that manages personalized knowledge separately, achieving substantial improvements on both single-memory and joint-memory tasks. Project website: https://connoriginal.github.io/MEMENTO