Large Language Models (LLMs) have quickly become invaluable assistants for a variety of tasks. However, their effectiveness is constrained by their ability to tailor responses to human preferences and behaviors via personalization. Prior work in LLM personalization has largely focused on style transfer or on incorporating small factoids about the user, as knowledge injection remains an open challenge. In this paper, we explore injecting knowledge of prior conversations into LLMs to enable future work on less redundant, personalized conversations. We identify two real-world constraints: (1) conversations are sequential in time and must be treated as such during training, and (2) per-user personalization is only viable in parameter-efficient settings. To this end, we propose PLUM, a pipeline that performs data augmentation to up-sample conversations as question-answer pairs, which are then used to fine-tune a low-rank adaptation (LoRA) adapter with a weighted cross-entropy loss. Even in this first exploration of the problem, we perform competitively with baselines such as RAG, attaining an accuracy of 81.5% across 100 conversations.
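The weighted cross-entropy loss mentioned above can be illustrated with a minimal sketch. Note that the specific weighting scheme shown here (masking question tokens and scoring only answer tokens) is an illustrative assumption, not necessarily the exact weighting used in PLUM:

```python
import math

def weighted_cross_entropy(token_logprobs, weights):
    """Weighted cross-entropy over a token sequence: each token's negative
    log-likelihood is scaled by a per-token weight, then normalized by the
    total weight. Setting a token's weight to 0 excludes it from the loss."""
    assert len(token_logprobs) == len(weights)
    total = sum(-lp * w for lp, w in zip(token_logprobs, weights))
    return total / sum(weights)

# Toy QA pair: question tokens weighted 0, answer tokens weighted 1,
# so the loss is computed over the answer span only (an assumed scheme).
logprobs = [math.log(0.5), math.log(0.25), math.log(0.5)]
weights = [0.0, 1.0, 1.0]
loss = weighted_cross_entropy(logprobs, weights)
```

In practice the per-token log-probabilities would come from the LoRA-adapted model's forward pass, and the weights would be chosen to emphasize the knowledge-bearing answer tokens in each augmented question-answer pair.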