In conversational AI, personalizing dialogues with persona profiles and contextual understanding is essential. Despite large language models' (LLMs) improved response coherence, effective persona integration remains a challenge. In this work, we first study two common approaches for personalizing LLMs: textual prompting and direct fine-tuning. We observe that textual prompting often struggles to yield responses similar to the ground truths in datasets, while direct fine-tuning tends to produce repetitive or overly generic replies. To alleviate these issues, we propose \textbf{S}elective \textbf{P}rompt \textbf{T}uning (SPT), which softly prompts LLMs for personalized conversations in a selective way. Concretely, SPT initializes a set of soft prompts and uses a trainable dense retriever to adaptively select suitable soft prompts for the LLM according to different input contexts, where the prompt retriever is dynamically updated through feedback from the LLM. Additionally, we propose context-prompt contrastive learning and prompt fusion learning to encourage SPT to enhance the diversity of personalized conversations. Experiments on the CONVAI2 dataset demonstrate that SPT significantly enhances response diversity by up to 90\%, along with improvements in other critical performance indicators. These results highlight the efficacy of SPT in fostering engaging and personalized dialogue generation. The SPT model code (https://github.com/hqsiswiliam/SPT) is publicly available for further exploration.
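The core selective mechanism can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the paper's released code: all function and variable names (`select_prompt`, `prompt_keys`, `context_emb`) are illustrative assumptions. It shows only the retrieval step, where a dense retriever scores each soft prompt against a context embedding and the best-matching prompt is chosen; the dynamic updating of the retriever via LLM feedback, as well as the contrastive and fusion objectives, are not modeled here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def select_prompt(context_emb, prompt_keys):
    """Score each soft prompt's key vector against the dialogue-context
    embedding by dot product, then pick the highest-scoring prompt.
    In SPT, the selected soft prompt would be prepended to the LLM input,
    and retrieval scores would be trained via feedback from the LLM."""
    scores = [sum(c * k for c, k in zip(context_emb, key))
              for key in prompt_keys]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Toy example: 3 candidate soft prompts, 4-dim context embedding.
keys = [[0.1, 0.9, 0.0, 0.2],
        [0.8, 0.1, 0.3, 0.0],
        [0.2, 0.2, 0.9, 0.5]]
ctx = [0.7, 0.1, 0.4, 0.1]
idx, probs = select_prompt(ctx, keys)  # idx is the chosen prompt's index
```

In this toy setup, the second key aligns best with the context, so prompt 1 would be selected; in the full method, the softmax distribution (rather than a hard argmax alone) supports the learning signals described above.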