Reproducibility in scientific work has become increasingly important in research communities such as machine learning, natural language processing, and computer vision, driven by the rapid development of these research domains supported by recent advances in deep learning. In this work, we present a significantly upgraded version of torchdistill, a modular, coding-free deep learning framework whose initial release supported only image classification and object detection tasks for reproducible knowledge distillation experiments. To demonstrate that the upgraded framework can support more tasks with third-party libraries, we reproduce the GLUE benchmark results of BERT models using a script based on the upgraded torchdistill that integrates with various Hugging Face libraries. All 27 fine-tuned BERT models and the configurations to reproduce the results are published on Hugging Face, and the model weights have already been widely used in research communities. We also reimplement popular small-sized models and new knowledge distillation methods and perform additional experiments for computer vision tasks.