Large Language Model (LLM)-based agent simulation has emerged as a promising approach to meet the increasing demand for real-time and rigorous evaluation in modern recommender systems. A typical LLM-driven simulation framework comprises three essential components: the profile module, memory module, and action module. However, existing studies have primarily concentrated on enhancing the memory and action modules, with limited attention to profile generation, which plays a pivotal role in ensuring realistic agent behaviours and aligning simulated interactions with real user dynamics. Moreover, the scarcity of datasets specifically designed for recommendation simulations has led to heavy reliance on manually crafted profiles, significantly limiting the scalability and generalisability of simulation frameworks across different datasets. To address these challenges, this work proposes an Automated Profile Generation Framework for Recommendation Simulation, APG4RecSim, that constructs realistic, coherent, and robust user profiles with minimal supervision. Extensive experiments on three benchmark datasets demonstrate that APG4RecSim achieves the best overall performance on discrimination, ranking, and rating tasks, improving ranking quality by up to 7% in nDCG@10 and reducing rating distribution divergence by 8% in JSD compared to existing profile-generation baselines. Beyond overall performance gains, our results show that profiles generated by APG4RecSim are resilient to popularity- and position-induced biases and maintain stable performance across datasets and different LLMs.
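For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how nDCG@10 and JSD are commonly computed. It is not taken from APG4RecSim: the abstract does not state which formulation the authors use, so the linear gain, the log2 rank discount, and the base-2 logarithm for JSD are assumptions, and the function names (`dcg_at_k`, `ndcg_at_k`, `normalise`, `jsd`) are hypothetical.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k items of a ranked list.

    Assumes linear gain and a log2 discount (one common convention).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def normalise(counts):
    """Turn a histogram of rating counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Uses log base 2, so the result lies in [0, 1].
    """
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0 here,
        # because b is always the mixture of the two inputs.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Toy data for illustration only; not results from the paper.
    ranked_relevance = [1, 0, 1, 0, 0]      # relevance of items in ranked order
    real_ratings = normalise([5, 10, 20, 40, 25])       # counts for 1..5 stars
    simulated_ratings = normalise([2, 8, 25, 45, 20])
    print(ndcg_at_k(ranked_relevance, k=10))
    print(jsd(real_ratings, simulated_ratings))
```

Under these conventions a higher nDCG@10 means the simulated agent ranks relevant items nearer the top, and a lower JSD means the simulated rating distribution sits closer to the real one.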