Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development. To address this challenge, we envision a recommendation simulator that capitalizes on recent breakthroughs in the human-level intelligence exhibited by Large Language Models (LLMs). We propose Agent4Rec, a user simulator for recommendation that leverages LLM-empowered generative agents equipped with user profile, memory, and action modules specifically tailored to recommender systems. In particular, the agents' profile modules are initialized from real-world datasets (e.g., MovieLens, Steam, Amazon-Book), capturing users' unique tastes and social traits; the memory modules log both factual and emotional memories and are integrated with an emotion-driven reflection mechanism; the action modules support a wide variety of behaviors, spanning both taste-driven and emotion-driven actions. Each agent interacts with personalized recommender models in a page-by-page manner, relying on a pre-implemented collaborative filtering-based recommendation algorithm. We examine both the capabilities and limitations of Agent4Rec to explore an essential research question: "To what extent can LLM-empowered generative agents faithfully simulate the behavior of real, autonomous humans in recommender systems?" Extensive and multi-faceted evaluations of Agent4Rec highlight both the alignment and the deviation between agent behavior and users' personalized preferences. Beyond performance comparison, we conduct further experiments, such as emulating the filter bubble effect and discovering the underlying causal relationships in recommendation tasks. Our code is available at https://github.com/LehengTHU/Agent4Rec.
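To make the page-by-page interaction concrete, the following is a minimal sketch, not the Agent4Rec implementation. It assumes a fixed ranked list in place of the collaborative filtering recommender and simple hand-written rules in place of the LLM; the names `Agent`, `liked_genres`, `patience`, `act`, `wants_to_exit`, and `simulate` are hypothetical, and the reflection mechanism is reduced to a check on recent emotional memory.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an LLM-empowered agent (no LLM is called here)."""
    liked_genres: set                      # profile module: the user's tastes
    patience: int = 2                      # profile module: boring pages tolerated in a row
    factual_memory: list = field(default_factory=list)    # what was watched
    emotional_memory: list = field(default_factory=list)  # how each page felt

    def act(self, page):
        """Action module: watch items matching the profile, then log memories."""
        watched = [title for title, genre in page if genre in self.liked_genres]
        self.factual_memory.extend(watched)
        self.emotional_memory.append("satisfied" if watched else "bored")
        return watched

    def wants_to_exit(self):
        """Emotion-driven exit: leave after `patience` consecutive boring pages."""
        recent = self.emotional_memory[-self.patience:]
        return len(recent) == self.patience and all(e == "bored" for e in recent)


def simulate(agent, ranked_items, page_size=2):
    """Show the ranked list page by page until the agent exits or items run out."""
    pages_seen = 0
    for start in range(0, len(ranked_items), page_size):
        agent.act(ranked_items[start:start + page_size])
        pages_seen += 1
        if agent.wants_to_exit():
            break
    return pages_seen


# Toy ranked list standing in for a recommender's output (assumed data).
ranked = [("A", "sci-fi"), ("B", "drama"), ("C", "horror"),
          ("D", "horror"), ("E", "drama"), ("F", "drama"), ("G", "sci-fi")]
agent = Agent(liked_genres={"sci-fi"})
pages = simulate(agent, ranked)
```

In this toy run the agent watches one item on the first page, finds nothing of interest on the next two pages, and exits before reaching the last page. In a simulator of the kind described above, the rule-based `act` and `wants_to_exit` would be replaced by LLM-generated decisions conditioned on the profile and memory.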