Personalizing large language models (LLMs) for individual users has become increasingly important as these models are integrated into real-world applications that support users' daily lives. However, existing personalization approaches often fail to distinguish which components of model predictions and training data truly reflect user preferences, leading to superficial personalization alignment. In this paper, we introduce NextQuill, a novel LLM personalization alignment framework grounded in causal preference modeling. We approach personalization from a causal perspective, treating both model predictions and ground-truth data generation as outcomes influenced by user preferences along with other factors. We define the true preference effect as the causal impact of user history (which reflects preferences) on each token prediction or data generation instance, and estimate it through causal intervention techniques. Building on this insight, NextQuill introduces two complementary alignment strategies: (1) aligning model-internal causal preference effects on predictions with those reflected in ground-truth data, rather than indiscriminately fitting predictions, and (2) focusing on fitting preference-bearing tokens identified via ground-truth data preference effects, rather than treating all tokens uniformly. By integrating these strategies, NextQuill shifts the alignment process toward learning from causal preference effects, facilitating more effective personalized adaptation. Experiments across multiple personalization benchmarks demonstrate that NextQuill significantly improves personalization quality, offering a principled, causal foundation for LLM personalization. Our code is available at https://github.com/juntaoyou/NextQuill.
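To make the two alignment strategies concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the preference effect on a token can be approximated as the difference between its log-probability with and without the user history in the prompt, and that the data-side effect is obtained from some reference scorer. The function names, the threshold rule, the `boost` factor, the squared-error alignment term, and the weight `lam` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def preference_effect(logp_with_history, logp_without_history):
    """Per-token effect of user history (assumed form): log-prob with history
    minus log-prob without history."""
    with_h = np.asarray(logp_with_history, dtype=float)
    without_h = np.asarray(logp_without_history, dtype=float)
    return with_h - without_h


def token_weights(data_effect, threshold=0.5, boost=2.0):
    """Hypothetical rule for strategy (2): up-weight tokens whose data-side
    effect exceeds a threshold; all other tokens keep weight 1."""
    return np.where(np.asarray(data_effect) > threshold, boost, 1.0)


def sketch_loss(model_with, model_without, data_with, data_without,
                threshold=0.5, boost=2.0, lam=1.0):
    """Illustrative objective combining both strategies.

    model_* : log-probs of the ground-truth tokens under the model being trained
    data_*  : log-probs of the same tokens under a reference scorer (assumption)
    """
    data_eff = preference_effect(data_with, data_without)
    model_eff = preference_effect(model_with, model_without)

    # Strategy (2): weighted negative log-likelihood focused on
    # preference-bearing tokens.
    w = token_weights(data_eff, threshold, boost)
    nll = -(w * np.asarray(model_with, dtype=float)).sum() / w.sum()

    # Strategy (1): pull the model's preference effect toward the data's
    # (squared error is an illustrative choice of distance).
    align = np.mean((model_eff - data_eff) ** 2)

    return nll + lam * align


if __name__ == "__main__":
    loss = sketch_loss(
        model_with=[-1.0, -1.0, -1.0],
        model_without=[-1.0, -1.0, -1.0],
        data_with=[-1.0, -0.5, -2.0],
        data_without=[-1.0, -2.0, -2.1],
    )
    print(float(loss))
```

In this toy call only the second token has a data-side effect above the threshold, so it alone receives the larger weight, and the alignment term penalizes the model for showing no sensitivity to the history on that token.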