Large language models (LLMs) have demonstrated impressive capabilities in role-playing tasks. However, there is limited research on whether LLMs can accurately simulate user behavior in real-world scenarios such as social media, which requires models to effectively analyze a user's history and simulate their role. In this paper, we introduce \textbf{FineRob}, a novel fine-grained behavior simulation dataset. We collect the complete behavioral history of 1,866 distinct users across three social media platforms. Each behavior is decomposed into three fine-grained elements: object, type, and content, resulting in 78.6k QA records. Based on FineRob, we identify two dominant reasoning patterns in LLMs' behavior simulation processes and propose the \textbf{OM-CoT} fine-tuning method to enhance this capability. Through comprehensive experiments, we conduct an in-depth analysis of the key factors in behavior simulation and demonstrate the effectiveness of the OM-CoT approach.\footnote{Code and dataset are available at \url{https://github.com/linkseed18612254945/FineRob}}