Knee OsteoArthritis (KOA) is a prevalent musculoskeletal disorder that reduces mobility in seniors. Because labelling is costly, the scarcity of annotated data is a persistent challenge when training learning models in the medical field. Deep neural network training currently depends heavily on data augmentation to improve the model's generalization capability and avoid over-fitting. However, existing data augmentation operations, such as rotation and gamma correction, operate on the data itself and therefore do not substantially increase data diversity. In this paper, we propose a novel approach based on the Vision Transformer (ViT) model that uses Selective Shuffled Position Embedding (SSPE) and an ROI-exchange strategy to obtain different input sequences as a form of data augmentation for early detection of KOA (KL-0 vs KL-2). More specifically, we keep the position embeddings of ROI patches fixed and shuffle those of non-ROI patches. Then, for each input image, we randomly select other images from the training set and exchange their ROI patches with it, thereby obtaining different input sequences. Finally, we derive a hybrid loss function that combines different loss functions with optimized weights. Experimental results show that the proposed approach is an effective method of data augmentation, as it significantly improves the model's classification performance.
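The two sequence-level operations described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`selective_shuffle_positions`, `exchange_roi`), the 16-patch layout, the ROI indices, and the use of plain Python lists instead of a deep learning framework are all assumptions made for illustration. The abstract does not specify how labels are handled after ROI exchange or which losses enter the hybrid loss, so neither is modelled here.

```python
import random

def selective_shuffle_positions(pos_embed, roi_idx, rng):
    """Copy pos_embed, permuting the rows of non-ROI patches and keeping ROI rows fixed."""
    roi = set(roi_idx)
    non_roi = [i for i in range(len(pos_embed)) if i not in roi]
    permuted = non_roi[:]
    rng.shuffle(permuted)  # random permutation of the non-ROI indices only
    out = list(pos_embed)
    for dst, src in zip(non_roi, permuted):
        out[dst] = pos_embed[src]
    return out

def exchange_roi(patches, donor_patches, roi_idx):
    """Copy patches, replacing the ROI patches with those of a donor image."""
    out = list(patches)
    for i in roi_idx:
        out[i] = donor_patches[i]
    return out

# Hypothetical setup: a 4x4 patch grid with the central 2x2 block as the ROI.
rng = random.Random(0)
num_patches, dim = 16, 4
roi_idx = [5, 6, 9, 10]

def random_rows():
    """Stand-in for learned embeddings: num_patches rows of dim random values."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_patches)]

pos_embed = random_rows()   # position embedding table (assumed, one row per patch)
patches_a = random_rows()   # patch embeddings of the input image
patches_b = random_rows()   # patch embeddings of a randomly chosen training image

shuffled_pos = selective_shuffle_positions(pos_embed, roi_idx, rng)
mixed = exchange_roi(patches_a, patches_b, roi_idx)

# Input sequence to the transformer: patch embedding plus position embedding.
tokens = [[p + e for p, e in zip(patch, pos)] for patch, pos in zip(mixed, shuffled_pos)]
```

In this sketch the ROI patches always see their original position embeddings, while the surrounding patches receive a different arrangement on every call, which is one plausible reading of how the method yields different input sequences from the same image.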