Large language models (LLMs) excel at general next-token prediction but still struggle to generate responses that reflect how individuals actually communicate, such as replying to emails or social messages in their own style. Learning such styles requires personal data, yet real social networking service (SNS) or email histories are difficult to collect due to privacy concerns. To address this, we propose the task of "Your Next Token Prediction" (YNTP), which models a user's precise word choices through controlled human-agent conversations. We build a multilingual benchmark of 100 dialogue sessions across English, Japanese, and Chinese, in which users interact for five days with psychologically grounded NPCs designed around MBTI dimensions. This setup captures natural, daily-life communication patterns and enables analysis of users' internal models. We evaluate prompt-based and fine-tuning-based personalization methods, establishing the first benchmark for YNTP and a foundation for user-aligned language modeling. The dataset is available at: https://github.com/AnonymousHub4Submissions/your-next-token-prediction-dataset-100
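To make the task concrete, below is a minimal sketch of what YNTP evaluation could look like: given the dialogue history as context, measure how often a causal language model's top-1 next-token prediction matches the token the user actually produced. The record schema (`history`, `user_reply`) and the `gpt2` stand-in model are assumptions for illustration, not the dataset's released format or the paper's evaluated models.

```python
# Minimal YNTP evaluation sketch: per-token top-1 accuracy on a user's reply,
# conditioned on the preceding dialogue. Field names are hypothetical.
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in base model; swap in a personalized model to compare

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def yntp_accuracy(history: str, user_reply: str) -> float:
    """Fraction of the user's reply tokens the model predicts top-1."""
    context_ids = tokenizer(history, return_tensors="pt").input_ids
    reply_ids = tokenizer(user_reply, return_tensors="pt").input_ids
    input_ids = torch.cat([context_ids, reply_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict the token at position t+1, so the reply
    # tokens are predicted by the slice starting one step before the reply.
    start = context_ids.shape[1] - 1
    preds = logits[0, start : start + reply_ids.shape[1]].argmax(dim=-1)
    return (preds == reply_ids[0]).float().mean().item()


if __name__ == "__main__":
    # One hypothetical record in the assumed schema.
    record = json.loads(
        '{"history": "NPC: How was your day?\\nUser:", '
        '"user_reply": " Pretty good, just finished work."}'
    )
    acc = yntp_accuracy(record["history"], record["user_reply"])
    print(f"top-1 token accuracy: {acc:.2f}")
```

Running the same metric with a prompt-based or fine-tuned personalized model versus the base model gives a simple way to quantify how much personalization improves alignment with the user's actual word choices.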