We study how to fine-tune LLMs using user-edit deployment data, consisting of tuples of a context, an agent's response, and the user's edits to that response. Such deployment data is naturally generated by users in applications such as LLM-based writing assistants and coding agents. The _natural_ origin of user edits makes them a desirable source for adapting and personalizing LLMs. In this setup, several feedback types that are typically studied separately in the literature, namely preferences, supervised labels, and costs, emerge in a unified form. In this paper, we initiate the theoretical investigation of learning from user edits. We first derive bounds for algorithms that learn from each of these feedback types, and prove that these algorithms exhibit different trade-offs depending on the user, the data distribution, and the model class. We then propose a simple ensembling procedure to jointly learn from all of these feedback types. On two domains adapted from Gao et al. (2024), we show that our ensembling procedure outperforms methods that learn from an individual feedback type. Further, we show that our procedure robustly adapts to different user-edit distributions at test time.
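To make the unification concrete, below is a minimal Python sketch, our own illustration rather than the paper's implementation, of how a single deployment record can yield all three feedback types. The record fields and the similarity-based cost are assumptions for exposition.

```python
# A minimal sketch (not the paper's method) showing how one user-edit
# record yields preference, supervised-label, and cost feedback.
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class EditRecord:
    context: str         # prompt / surrounding document state
    agent_response: str  # what the model produced
    user_edit: str       # the response after the user's revisions


def as_supervised_label(r: EditRecord) -> tuple[str, str]:
    # The edited text serves as a gold target for the context.
    return (r.context, r.user_edit)


def as_preference(r: EditRecord) -> tuple[str, str, str]:
    # The edited text is implicitly preferred over the original response:
    # (context, chosen, rejected).
    return (r.context, r.user_edit, r.agent_response)


def as_cost(r: EditRecord) -> float:
    # A scalar cost reflecting how much the user had to change the output;
    # 1 minus a similarity ratio is an assumed proxy for edit effort.
    return 1.0 - SequenceMatcher(None, r.agent_response, r.user_edit).ratio()


record = EditRecord(
    context="Summarize the meeting notes.",
    agent_response="The meeting covered budget and hiring.",
    user_edit="The meeting covered Q3 budget cuts and two new hires.",
)
print(as_supervised_label(record))
print(as_preference(record))
print(f"cost = {as_cost(record):.2f}")
```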