Recent advances in deep learning (DL) have been accelerated by access to large-scale data and compute. These large-scale resources have been used to train progressively larger models, which are resource-intensive in terms of compute, data, energy, and carbon emissions. These costs are becoming a new type of entry barrier for researchers and practitioners with limited access to resources at such scale, particularly in the Global South. In this work, we take a comprehensive look at the landscape of existing DL models for medical image analysis tasks and demonstrate their usefulness in settings where resources are limited. To account for the resource consumption of DL models, we introduce a novel measure that estimates performance per resource unit, which we call the PePR score. Using a diverse family of 131 unique DL architectures (spanning 1M to 130M trainable parameters) and three medical image datasets, we capture trends in the performance-resource trade-off. In applications like medical image analysis, we argue that small-scale, specialized models are preferable to large-scale models. Furthermore, we show that fine-tuning existing pretrained models on new data can significantly reduce the computational resources and data required compared to training models from scratch. We hope this work will encourage the community to focus on improving AI equity by developing methods and models with smaller resource footprints.
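The abstract refers to the PePR score without defining it. As a minimal illustrative sketch, not the paper's exact formulation, one could assume the score divides task performance by a min-max-normalized resource cost, so that performance is rewarded while resource consumption is penalized; the function name pepr_score, the ratio form, and all numbers below are hypothetical.

```python
def pepr_score(performance: float, resource_cost: float,
               cost_min: float, cost_max: float) -> float:
    """Hypothetical performance-per-resource score (illustrative only).

    Assumes the resource cost (e.g., parameters, energy, or FLOPs) is
    min-max normalized over the model family under study, and that the
    score is performance divided by (1 + normalized cost).
    """
    cost_norm = (resource_cost - cost_min) / (cost_max - cost_min)
    return performance / (1.0 + cost_norm)


# Hypothetical example: a small and a large model on the same task.
models = {
    "small_cnn": {"perf": 0.86, "params_M": 5.0},
    "large_vit": {"perf": 0.89, "params_M": 120.0},
}
costs = [m["params_M"] for m in models.values()]
for name, m in models.items():
    score = pepr_score(m["perf"], m["params_M"], min(costs), max(costs))
    print(f"{name}: PePR-like score = {score:.3f}")
```

Under this assumed form, a model that nearly matches a larger one's performance at a fraction of the resource cost receives a higher score, which is consistent with the abstract's argument for small-scale, specialized models.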