Growing interest in pet healthcare has increased the demand for computer-aided diagnosis (CAD) systems in veterinary medicine. However, the development of veterinary CAD has stagnated because sufficient radiology data are lacking. To overcome this challenge, we propose a generative active learning framework based on a variational autoencoder, which aims to alleviate the scarcity of reliable data for CAD systems in veterinary medicine. This study uses datasets of cardiomegaly radiographs. After removing annotations and standardizing the images, we applied the framework for data augmentation; it consists of a data generation phase and a query phase that filters the generated data. The experimental results show that, as data generated through this framework were added to the training data of the generative model, the Fréchet inception distance (FID) decreased consistently from 84.14 to 50.75 on the radiographs. Subsequently, when the generated data were incorporated into the training of the classification model, the false-positive entry of the confusion matrix also improved from 0.16 to 0.66 on the radiographs. The proposed framework has the potential to address data scarcity in medical CAD and thereby contribute to its advancement.
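The two-phase loop described above (generate candidates, then query to filter them before they join the training data) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the query phase scores samples, so `augment_with_queries`, `make_toy_generator`, `toy_query_score`, the acceptance `threshold`, and the use of short float vectors in place of radiographs are all hypothetical stand-ins. In particular, the toy generator replaces variational autoencoder training with sampling around the pool mean.

```python
import math
import random
from typing import Callable, List, Sequence

Sample = List[float]  # stand-in for an image; a real pipeline would use pixel arrays


def augment_with_queries(
    train: Sequence[Sample],
    fit_generator: Callable[[Sequence[Sample]], Callable[[], Sample]],
    query_score: Callable[[Sample], float],
    rounds: int = 3,
    n_candidates: int = 50,
    threshold: float = 0.5,
) -> List[Sample]:
    """Alternate a generation phase and a query phase, growing the training pool."""
    pool = [list(s) for s in train]
    for _ in range(rounds):
        # Generation phase: refit the generator on the current pool, then sample.
        generate = fit_generator(pool)
        candidates = [generate() for _ in range(n_candidates)]
        # Query phase: keep only the candidates that the scorer accepts.
        accepted = [c for c in candidates if query_score(c) >= threshold]
        pool.extend(accepted)
    return pool


def make_toy_generator(rng: random.Random, noise: float = 0.5):
    """Hypothetical stand-in for VAE training: sample around the pool mean."""
    def fit(pool: Sequence[Sample]) -> Callable[[], Sample]:
        dim = len(pool[0])
        mean = [sum(s[i] for s in pool) / len(pool) for i in range(dim)]
        return lambda: [m + rng.gauss(0.0, noise) for m in mean]
    return fit


def toy_query_score(sample: Sample) -> float:
    """Hypothetical query criterion: score in (0, 1], higher near the origin."""
    return 1.0 / (1.0 + math.sqrt(sum(x * x for x in sample)))


rng = random.Random(0)
seed_data = [[rng.gauss(0.0, 0.3) for _ in range(2)] for _ in range(20)]
augmented = augment_with_queries(seed_data, make_toy_generator(rng), toy_query_score)
```

In this sketch the original samples are always kept and only generated candidates pass through the filter, so each round can only enlarge the pool. A real system would presumably replace `toy_query_score` with whatever acceptance criterion the query phase uses, and track a quality metric such as FID across rounds.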