Advances in clinical treatment are increasingly constrained by the limitations of supervised learning, which depends heavily on large volumes of annotated data. Annotation is costly and demands substantial time from clinical specialists. To address this, we introduce S4MI (Self-Supervision and Semi-Supervision for Medical Imaging), a novel pipeline that leverages recent advances in self-supervised and semi-supervised learning. These techniques learn from auxiliary tasks that require no labels, which makes them easier to scale than fully supervised methods. We benchmark them on three distinct medical imaging datasets, evaluating their effectiveness on classification and segmentation tasks. Notably, self-supervised learning with only 10% of the annotations outperformed training with full annotation for classification on most datasets. Similarly, the semi-supervised approach outperformed fully supervised methods for segmentation with 50% fewer labels on all datasets. We have made the S4MI code openly accessible to support broader application and further development of these methods.