Recent advances in large language models (LLMs) have enabled realistic user simulators for developing and evaluating recommender systems (RSs). However, existing LLM-based simulators for RSs face two major limitations: (1) static, single-step prompt-based inference that leads to inaccurate and incomplete user profiles; and (2) an unrealistic, single-round recommendation-feedback interaction pattern that fails to capture real-world scenarios. To address these limitations, we propose DGDPO (Diagnostic-Guided Dynamic Profile Optimization), a novel framework that constructs user profiles through a dynamic, iterative optimization process to enhance simulation fidelity. Specifically, DGDPO incorporates two core modules within each optimization loop. First, a specialized LLM-based diagnostic module, calibrated through our novel training strategy, accurately identifies specific defects in the user profile. Second, a generalized LLM-based treatment module analyzes each diagnosed defect and generates targeted suggestions to refine the profile. Furthermore, unlike existing LLM-based user simulators, which are limited to single-round interactions, we are the first to integrate DGDPO with sequential recommenders, enabling a bidirectional evolution in which user profiles and recommendation strategies adapt to each other over multi-round interactions. Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed framework.
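The diagnose-then-treat loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the abstract does not specify how profiles or defects are represented, how the loop terminates, or how the modules are prompted. The names `optimize_profile`, `toy_diagnose`, and `toy_treat` are hypothetical, and the two toy functions are simple rule-based stand-ins for the LLM-based diagnostic and treatment modules.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical interfaces (assumptions, not taken from the paper):
#   a diagnostic module maps a profile to one defect label, or None if it finds none;
#   a treatment module maps (profile, defect) to a revised profile.
Diagnose = Callable[[str], Optional[str]]
Treat = Callable[[str, str], str]


@dataclass
class Profile:
    text: str
    history: List[str] = field(default_factory=list)  # earlier versions, oldest first


def optimize_profile(initial: str, diagnose: Diagnose, treat: Treat,
                     max_rounds: int = 5) -> Profile:
    """Iteratively refine a profile: diagnose one defect, then treat it.

    Stopping when no defect is found or after max_rounds is an assumed
    criterion; the abstract does not state one.
    """
    profile = Profile(text=initial)
    for _ in range(max_rounds):
        defect = diagnose(profile.text)
        if defect is None:  # nothing left to fix
            break
        profile.history.append(profile.text)
        profile.text = treat(profile.text, defect)
    return profile


# Toy stand-ins for the two LLM-based modules (purely illustrative).
def toy_diagnose(text: str) -> Optional[str]:
    """Report the first required field that the profile does not mention."""
    for key in ("genres", "budget"):
        if key not in text:
            return f"missing:{key}"
    return None


def toy_treat(text: str, defect: str) -> str:
    """Patch the profile by appending a placeholder for the missing field."""
    key = defect.split(":", 1)[1]
    return f"{text}; {key}=unknown"


if __name__ == "__main__":
    result = optimize_profile("likes sci-fi", toy_diagnose, toy_treat)
    print(result.text)
    print(len(result.history), "revision(s)")
```

Passing the two modules as plain callables keeps the loop independent of any particular model or prompting scheme, so the rule-based stand-ins could be swapped for model-backed functions without changing `optimize_profile`.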