Improve Large Language Model Systems with User Logs

Scaling training data and model parameters has long driven progress in large language models (LLMs), but this paradigm is increasingly constrained by the scarcity of high-quality data and by diminishing returns as computational costs rise. As a result, recent work has increasingly focused on continual learning from real-world deployment, where user interaction logs provide a rich source of authentic human feedback and procedural knowledge. However, learning from user logs is challenging because they are unstructured and noisy. Vanilla LLM systems often struggle to distinguish useful feedback signals from noisy user behavior, and the mismatch between user log collection and model optimization (e.g., the off-policy optimization problem) further exacerbates the difficulty. To this end, we propose UNO (User log-driveN Optimization), a unified framework for improving LLM systems (LLMsys) with user logs. UNO first distills logs into semi-structured rules and preference pairs, then employs query-and-feedback-driven clustering to manage data heterogeneity, and finally quantifies the cognitive gap between the model's prior knowledge and the log data. This assessment guides the LLMsys to adaptively filter out noisy feedback and to construct different modules for the primary and reflective experiences extracted from user logs, thereby improving future responses. Extensive experiments show that UNO achieves state-of-the-art effectiveness and efficiency, significantly outperforming Retrieval-Augmented Generation (RAG) and memory-based baselines. We have open-sourced our code at https://github.com/bebr2/UNO.
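The abstract does not say how the clustering or the cognitive gap is computed, so the following is only a toy sketch of the general idea, not UNO's implementation. Everything in it is an assumption made for illustration: the `LogEntry` type, the greedy Jaccard clustering in `cluster_logs`, its `threshold`, and the `model_agrees` callback that stands in for checking feedback against the model's prior knowledge. The actual method is in the authors' repository.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Hypothetical record of one logged interaction."""
    query: str
    feedback: str  # free-text user follow-up


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_logs(entries, threshold=0.3):
    """Greedy single-pass clustering on query + feedback token overlap.

    Assumption: a stand-in for query-and-feedback-driven clustering;
    a real system would more likely use learned embeddings.
    """
    clusters = []
    for entry in entries:
        signature = tokens(entry.query) | tokens(entry.feedback)
        for cluster in clusters:
            if jaccard(signature, cluster["signature"]) >= threshold:
                cluster["members"].append(entry)
                cluster["signature"] |= signature
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append({"signature": set(signature), "members": [entry]})
    return clusters


def cognitive_gap(cluster, model_agrees):
    """Fraction of a cluster's entries that the model does not already satisfy.

    `model_agrees(entry) -> bool` is a hypothetical callback; 0.0 means the
    logs add nothing new, 1.0 means they fully contradict the model's prior.
    """
    members = cluster["members"]
    disagreements = sum(1 for entry in members if not model_agrees(entry))
    return disagreements / len(members)
```

A gap score like this could then be compared against thresholds to decide which clusters to discard as noise and which to keep; where those thresholds sit, and how the kept clusters map onto separate modules, is left open here because the abstract does not specify it.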

