Distilling BlackBox to Interpretable models for Efficient Transfer Learning

Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, neural network (NN) models degrade under even a slight shift in the input distribution (e.g., a change of scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of the NN to be approximately domain-invariant. However, interpretable models typically underperform their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a mixture of shallow interpretable models using human-understandable concepts. Because each interpretable model covers a subset of the data, the mixture of interpretable models achieves performance comparable to that of the BB. Further, we use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, followed by fine-tuning the interpretable models in the target domain. We evaluate our model on a real-life, large-scale chest X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.
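The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes a binary concept, a source-trained concept classifier that outputs a probability per unlabeled target image, and a confidence threshold of 0.9. The function names `pseudo_label` and `select_confident` and the threshold value are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def pseudo_label(probs: List[float], threshold: float = 0.9) -> List[Optional[int]]:
    """Turn predicted concept probabilities into binary pseudo-labels.

    A prediction becomes label 1 if it is at least `threshold`,
    label 0 if it is at most `1 - threshold`, and None (rejected)
    otherwise. The threshold is an assumed hyperparameter.
    """
    labels: List[Optional[int]] = []
    for p in probs:
        if p >= threshold:
            labels.append(1)
        elif p <= 1.0 - threshold:
            labels.append(0)
        else:
            labels.append(None)  # not confident enough to use
    return labels


def select_confident(
    samples: List[str], probs: List[float], threshold: float = 0.9
) -> List[Tuple[str, int]]:
    """Keep only the target samples that received a pseudo-label."""
    labels = pseudo_label(probs, threshold)
    return [(s, y) for s, y in zip(samples, labels) if y is not None]


if __name__ == "__main__":
    # Hypothetical target-domain image IDs and concept probabilities.
    target_ids = ["img_a", "img_b", "img_c"]
    target_probs = [0.95, 0.50, 0.03]
    print(select_confident(target_ids, target_probs))
```

In a setup like this, the confident pairs would serve as training data for the target-domain concept classifier, and the rejected samples could be revisited in a later round once that classifier improves.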

