The emergence of large-scale pre-trained models has increased their use in a variety of downstream tasks, yet deployment remains a challenge in environments with limited computational resources. Knowledge distillation addresses such scenarios by transferring knowledge from large teacher models into smaller student models, but the process is non-trivial and has traditionally required technical expertise in AI/ML. To address these challenges, this paper presents InFiConD, a novel framework that leverages visual concepts to implement the knowledge distillation process and enable subsequent no-code fine-tuning of student models. We develop a novel knowledge distillation pipeline that extracts text-aligned visual concepts from a concept corpus using multimodal models and constructs highly interpretable linear student models, built on those visual concepts, that mimic a teacher model in a response-based manner. InFiConD's interface allows users to interactively fine-tune the student model by directly manipulating concept influences. We validate InFiConD via a robust usage scenario and a user study. Our findings indicate that InFiConD's human-in-the-loop and visualization-driven approach enables users to effectively create and analyze student models, understand how knowledge is transferred, and efficiently perform fine-tuning operations. We discuss how this work highlights the potential of interactive and visual methods to make knowledge distillation and subsequent no-code fine-tuning more accessible and adaptable to a wider range of users with domain-specific demands.
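To make the idea of a concept-based linear student concrete, the following is a minimal sketch and not InFiConD's actual implementation. It assumes that each image has already been reduced to a vector of concept scores, that "response-based" distillation means fitting the student's logits to the teacher's logits under a mean squared error loss, and that fine-tuning amounts to rescaling a single concept-to-class weight. The abstract does not specify the loss or the optimizer, and all names here (`fit_linear_student`, `scale_concept_influence`, `TOY_TEACHER_WEIGHTS`, `make_toy_data`) are hypothetical. The toy teacher is itself linear so that the result is easy to check.

```python
import random

# Hypothetical toy teacher: 3 concepts -> 2 class logits (an assumption for the demo).
TOY_TEACHER_WEIGHTS = [[2.0, -1.0], [0.5, 1.5], [-1.0, 0.0]]


def predict(weights, concept_scores):
    """Linear student: one logit per class as a weighted sum of concept scores."""
    n_classes = len(weights[0])
    return [
        sum(score * weights[i][j] for i, score in enumerate(concept_scores))
        for j in range(n_classes)
    ]


def make_toy_data(seed=0, n_samples=50):
    """Random concept scores in [0, 1) and the toy teacher's logits for them."""
    rng = random.Random(seed)
    samples = [[rng.random() for _ in TOY_TEACHER_WEIGHTS] for _ in range(n_samples)]
    teacher_logits = [predict(TOY_TEACHER_WEIGHTS, x) for x in samples]
    return samples, teacher_logits


def fit_linear_student(samples, teacher_logits, lr=0.3, epochs=1000):
    """Full-batch gradient descent on the MSE between student and teacher logits."""
    n = len(samples)
    n_concepts, n_classes = len(samples[0]), len(teacher_logits[0])
    weights = [[0.0] * n_classes for _ in range(n_concepts)]
    for _ in range(epochs):
        grad = [[0.0] * n_classes for _ in range(n_concepts)]
        for x, target in zip(samples, teacher_logits):
            out = predict(weights, x)
            for j in range(n_classes):
                err = out[j] - target[j]
                for i in range(n_concepts):
                    grad[i][j] += 2.0 * err * x[i] / n
        for i in range(n_concepts):
            for j in range(n_classes):
                weights[i][j] -= lr * grad[i][j]
    return weights


def scale_concept_influence(weights, concept_idx, class_idx, factor):
    """Return a copy of the weights with one concept-to-class influence rescaled."""
    edited = [row[:] for row in weights]
    edited[concept_idx][class_idx] *= factor
    return edited


if __name__ == "__main__":
    samples, teacher_logits = make_toy_data()
    student = fit_linear_student(samples, teacher_logits)
    # Halve the influence of concept 0 on class 0, as a stand-in for a UI edit.
    edited = scale_concept_influence(student, 0, 0, 0.5)
    print("student weights:", student)
    print("edited weights: ", edited)
```

Because the student is linear, each weight can be read as the influence of one concept on one class. That is what makes a direct, no-code edit like the one in `scale_concept_influence` meaningful in this sketch.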