The rapid growth of pre-trained foundation models and their fine-tuned counterparts has contributed significantly to the advancement of machine learning. Leveraging pre-trained models (PTMs) to extract knowledge and expedite learning in real-world tasks, known as "Model Reuse", has become crucial in various applications. Previous research has typically focused on a single aspect of model reuse, including reusing model weights, structures, and hypothesis spaces. This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse built on the PyTorch backend. ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, tuning the target model with a PTM, and PTM-based inference. This empowers deep learning practitioners to explore downstream tasks and identify the complementary advantages of different methods. ZhiJian is available at https://github.com/zhangyikaii/lamda-zhijian, facilitating seamless use of pre-trained models and streamlining the model reuse process for researchers and developers.