Unlike visual scene recognition, where the availability of very large datasets for training deep neural networks has driven tremendous advances, inference from medical images is often hampered by the fact that only small amounts of data may be available. When working with very small datasets, on the order of a few hundred items, the power of deep learning may still be exploited by using a model pre-trained on natural images as a feature extractor and applying classical pattern recognition techniques in this feature space, the so-called few-shot learning problem. In regimes where the dimension of this feature space is comparable to or even larger than the number of data items, dimensionality reduction is a necessity and is often achieved by principal component analysis, which is computed via singular value decomposition (SVD). In this paper, noting the inappropriateness of using SVD in this setting, we introduce and explore two alternatives based on discriminant analysis and non-negative matrix factorization (NMF). Using 14 different datasets spanning 11 distinct disease types, we demonstrate that discriminant subspaces at low dimensions achieve significant improvements over SVD-based subspaces and the original feature space. We also show that NMF at modest dimensions is a competitive alternative to SVD in this setting.
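The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: synthetic non-negative vectors stand in for features from a pre-trained network (as if taken after a ReLU layer), the classifier is logistic regression, the discriminant projection uses shrinkage-regularised LDA from scikit-learn, and the helper names `make_synthetic_features`, `build_models`, and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF, PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def make_synthetic_features(n_samples=200, n_features=512, n_classes=3, seed=0):
    """Synthetic stand-in for pre-trained CNN features (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % n_classes               # balanced class labels
    class_means = 0.5 * rng.normal(size=(n_classes, n_features))
    X = rng.normal(size=(n_samples, n_features)) + class_means[y]
    X = np.maximum(X, 0.0)                             # ReLU-like: non-negative, as NMF requires
    return X, y


def build_models(n_components=8, n_classes=3):
    """One classifier per feature space: raw, PCA (SVD), LDA, NMF."""
    def clf():
        return LogisticRegression(max_iter=1000)

    return {
        "raw": clf(),
        "pca": make_pipeline(PCA(n_components=n_components), clf()),
        # LDA yields at most n_classes - 1 discriminant directions.
        # Shrinkage is an assumption here, chosen because features outnumber samples.
        "lda": make_pipeline(
            LinearDiscriminantAnalysis(
                solver="eigen", shrinkage="auto", n_components=n_classes - 1
            ),
            clf(),
        ),
        "nmf": make_pipeline(
            NMF(n_components=n_components, init="nndsvda", max_iter=500, random_state=0),
            clf(),
        ),
    }


def evaluate(X, y, n_components=8, cv=5):
    """Mean cross-validated accuracy for each feature space."""
    n_classes = len(np.unique(y))
    models = build_models(n_components=n_components, n_classes=n_classes)
    return {
        name: float(cross_val_score(model, X, y, cv=cv).mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    X, y = make_synthetic_features()
    for name, acc in evaluate(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Each reduction is fitted inside the cross-validation pipeline, so the projection is learned from the training folds only. Accuracies on this synthetic data say nothing about the medical datasets in the paper; the sketch only shows how the four feature spaces can be compared under the same classifier.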