Learn From Model Beyond Fine-Tuning: A Survey

Foundation models (FMs) have demonstrated remarkable performance across a wide range of tasks, especially in natural language processing and computer vision. This performance is primarily attributed to their ability to comprehend instructions and their access to extensive, high-quality data. It not only showcases their current effectiveness but also sets a promising trajectory towards the development of artificial general intelligence. Unfortunately, due to multiple constraints, the raw data used to train large models are often inaccessible, so using end-to-end models for downstream tasks has become a new research trend, which we call Learn From Model (LFM) in this article. LFM focuses on the research, modification, and design of FMs based on the model interface, so as to better understand the model structure and weights (in a black-box environment) and to generalize the model to downstream tasks. The study of LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta-learning, and model editing. Each category encompasses a repertoire of methods and strategies that aim to enhance the capabilities and performance of FMs. This paper gives a comprehensive review of current FM-based methods from the perspective of LFM, in order to help readers better understand the current state of research and its underlying ideas. We conclude the survey by highlighting several critical areas for future exploration and by addressing open issues that require further attention from the research community. The relevant papers investigated in this article can be accessed at <https://github.com/ruthless-man/Awesome-Learn-from-Model>.

