Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. Although initialization has been studied extensively for non-medical images, its impact on medical images, particularly chest X-rays (CXRs), is less well understood. To address this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, and Shrink and Perturb start, for both adult and pediatric populations. We specifically consider scenarios in which training data arrive periodically, reflecting the real-world conditions of ongoing data influx and the need for model updates. We evaluate the generalizability of these models on external adult and pediatric CXR datasets. We also propose novel ensemble methods, F-score-weighted Sequential Least-Squares Quadratic Programming (F-SLSQP) and Attention-Guided Ensembles with Learnable Fuzzy Softmax, which aggregate weight parameters from multiple models to capitalize on their collective knowledge and complementary representations. We assess statistical significance using 95% confidence intervals and p-values. Our evaluations indicate that models initialized with ImageNet-pretrained weights generalize better than their randomly initialized counterparts, contradicting some findings for non-medical images. Notably, ImageNet-pretrained models perform consistently in internal and external testing across different training scenarios. Weight-level ensembles of these models show significantly higher recall (p<0.05) during testing than individual models. Thus, our study highlights the benefits of ImageNet-pretrained weight initialization, especially when combined with weight-level ensembles, for creating robust and generalizable deep learning solutions.
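To make the three initialization techniques named in the abstract concrete, the following is a minimal sketch of how each could set a model's weights when a new batch of training data arrives. It treats the weights as a flat list of floats. The function names and the values of `init_std`, `shrink`, and `noise_std` are illustrative assumptions and are not taken from the study; the shrink-and-perturb rule follows the commonly described form, in which the previous weights are scaled toward zero and small random noise is added.

```python
import random


def cold_start(n_weights, init_std=0.05, seed=None):
    """Cold-start: discard previous weights and draw a fresh random initialization.

    init_std is an assumed, illustrative scale for the random initializer.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, init_std) for _ in range(n_weights)]


def warm_start(previous_weights):
    """Warm-start: reuse the previously trained weights unchanged."""
    return list(previous_weights)


def shrink_and_perturb(previous_weights, shrink=0.5, noise_std=0.01, seed=None):
    """Shrink and Perturb: scale previous weights toward zero, then add noise.

    shrink and noise_std are assumed, illustrative hyperparameters.
    """
    rng = random.Random(seed)
    return [shrink * w + rng.gauss(0.0, noise_std) for w in previous_weights]


if __name__ == "__main__":
    trained = [0.8, -1.2, 0.3]  # hypothetical weights from the previous round
    print("cold:  ", cold_start(len(trained), seed=0))
    print("warm:  ", warm_start(trained))
    print("shrink:", shrink_and_perturb(trained, seed=0))
```

In a real training loop each function would be applied to every parameter tensor of the network before training resumes on the enlarged dataset; the flat list here only keeps the sketch free of framework dependencies.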