Knowledge-Augmented Large Language Models for Personalized Contextual Query Suggestion

Large Language Models (LLMs) excel at tackling various natural language tasks. However, due to the significant costs involved in re-training or fine-tuning them, they remain largely static and difficult to personalize. Nevertheless, a variety of applications could benefit from generations that are tailored to users' preferences, goals, and knowledge. Among them is web search, where knowing what a user is trying to accomplish, what they care about, and what they know can lead to improved search experiences. In this work, we propose a novel and general approach that augments an LLM with relevant context from users' interaction histories with a search engine in order to personalize its outputs. Specifically, we construct an entity-centric knowledge store for each user based on their search and browsing activities on the web, which is then leveraged to provide contextually relevant LLM prompt augmentations. This knowledge store is light-weight, since it only produces user-specific aggregate projections of interests and knowledge onto public knowledge graphs, and leverages existing search log infrastructure, thereby mitigating the privacy, compliance, and scalability concerns associated with building deep user profiles for personalization. We validate our approach on the task of contextual query suggestion, which requires understanding not only the user's current search context but also what they historically know and care about. Through a number of experiments based on human evaluation, we show that our approach is significantly better than several other LLM-powered baselines, generating query suggestions that are contextually more relevant, personalized, and useful.
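The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: keep per-user aggregate counts of entities, select the ones related to the current search context, and prepend them to the LLM prompt. Every name here (`UserKnowledgeStore`, `top_related`, `build_prompt`, the `related` neighbour map, the prompt wording) is a hypothetical stand-in and not the authors' system. Entity linking against a public knowledge graph is assumed to have happened upstream.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UserKnowledgeStore:
    """Hypothetical per-user store holding only aggregate entity counts."""
    entity_counts: Counter = field(default_factory=Counter)

    def add_interaction(self, entities):
        # Record which (already linked) entities a search or page view touched.
        # Only counts are kept, not the raw queries or page contents.
        self.entity_counts.update(entities)

    def top_related(self, context_entities, related, k=3):
        # `related` is a toy stand-in for a public knowledge graph:
        # it maps an entity to a list of neighbouring entities.
        candidates = set(context_entities)
        for entity in context_entities:
            candidates.update(related.get(entity, ()))
        # Keep only candidates the user has actually interacted with.
        scored = [(self.entity_counts[e], e) for e in candidates
                  if self.entity_counts[e] > 0]
        # Highest count first; ties broken by name so the order is deterministic.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [entity for _, entity in scored[:k]]


def build_prompt(current_query, user_entities):
    # Assumed prompt template; the real wording would be an experimental choice.
    known = ", ".join(user_entities) if user_entities else "none recorded"
    return (
        f"The user is searching for: {current_query}\n"
        f"Entities the user has engaged with before: {known}\n"
        "Suggest three follow-up queries tailored to this user."
    )


if __name__ == "__main__":
    store = UserKnowledgeStore()
    store.add_interaction(["Python", "pandas"])
    store.add_interaction(["Python", "NumPy"])
    toy_graph = {"data analysis": ["pandas", "NumPy", "R"]}
    entities = store.top_related(["data analysis"], toy_graph, k=2)
    print(build_prompt("data analysis tutorial", entities))
```

Storing only counts keyed by public entities, rather than raw logs, mirrors the lightweight, aggregate design the abstract describes. A real system would also need relevance scoring beyond raw counts.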

