Recent advances in deep learning (DL) have been accelerated by access to large-scale data and compute. These large-scale resources have been used to train progressively larger models, which are resource-intensive in terms of compute, data, energy, and carbon emissions. These costs are becoming a new type of entry barrier for researchers and practitioners with limited access to resources at such scale, particularly in the Global South. In this work, we take a comprehensive look at the landscape of existing DL models for medical image analysis tasks and demonstrate their usefulness in settings where resources are limited. To account for the resource consumption of DL models, we introduce a novel measure that estimates performance per resource unit, which we call the PePR score. Using a diverse family of 131 unique DL architectures (spanning 1M to 130M trainable parameters) and three medical image datasets, we capture trends in the performance-resource trade-off. In applications like medical image analysis, we argue that small-scale, specialized models are preferable to large-scale models. Furthermore, we show that fine-tuning existing pretrained models on new data can significantly reduce the computational resources and data required compared to training models from scratch. We hope this work will encourage the community to focus on improving AI equity by developing methods and models with smaller resource footprints.
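The abstract refers to the PePR score without defining it. As a minimal illustrative sketch, not the paper's exact formulation, one could assume the score divides task performance by a min-max-normalized resource cost, so that performance is rewarded while resource consumption is penalized; the function name pepr_score, the ratio form, and all numbers below are hypothetical.

```python
def pepr_score(performance: float, resource_cost: float,
               cost_min: float, cost_max: float) -> float:
    """Hypothetical performance-per-resource score (illustrative only).

    Assumes the resource cost (e.g., parameters, energy, or FLOPs) is
    min-max normalized over the model family under study, and that the
    score is performance divided by (1 + normalized cost).
    """
    cost_norm = (resource_cost - cost_min) / (cost_max - cost_min)
    return performance / (1.0 + cost_norm)


# Hypothetical example: a small and a large model on the same task.
models = {
    "small_cnn": {"perf": 0.86, "params_M": 5.0},
    "large_vit": {"perf": 0.89, "params_M": 120.0},
}
costs = [m["params_M"] for m in models.values()]
for name, m in models.items():
    score = pepr_score(m["perf"], m["params_M"], min(costs), max(costs))
    print(f"{name}: PePR-like score = {score:.3f}")
```

Under this assumed form, a model that nearly matches a larger one's performance at a fraction of the resource cost receives a higher score, which is consistent with the abstract's argument for small-scale, specialized models.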