The recent emergence of Large Language Models (LLMs) has heralded a new era of human-AI interaction. These sophisticated models, exemplified by ChatGPT and its successors, have exhibited remarkable capabilities in language understanding. However, as LLMs have undergone exponential growth, a crucial dimension that remains understudied is the personalization of these models. Large foundation models such as GPT-3 focus on creating a universal model that serves a broad range of tasks and users. This approach emphasizes the model's generalization capabilities, treating users as a collective rather than as distinct individuals. While practical for many common applications, this one-size-fits-all approach often fails to address the rich tapestry of human diversity and individual needs. To explore this issue, we introduce the PEFT-U Benchmark: a new dataset for building and evaluating NLP models for user personalization. \datasetname{} consists of a series of user-centered tasks containing diverse and individualized expressions, where the preferences of users can potentially differ for the same input. Using PEFT-U, we explore the challenge of efficiently personalizing LLMs to accommodate user-specific preferences in the context of diverse user-centered tasks.