In image classification, existing methods often struggle with biased or ambiguous data, a prevalent issue in real-world scenarios. Current strategies, including semi-supervised learning and class blending, offer partial solutions but do not fully resolve the problem. To address this gap, we introduce a novel strategy for generating high-quality labels for challenging datasets. Central to our approach is a clearly structured flowchart, derived from a broad literature review, that guides the creation of reliable labels. We validate our methodology on a rigorous real-world test case in the biomedical field: estimating height reduction from vertebral imaging. Our empirical study, which draws on over 250,000 annotations, demonstrates the effectiveness of our strategy's decisions compared to their alternatives.