Computer vision methods have valuable applications in precision medicine, and recognizing the facial phenotypes of genetic disorders is one of them. Many genetic disorders are known to affect the visual appearance and geometry of the face. Automated classification and similarity retrieval can help physicians diagnose possible genetic conditions as early as possible. Previous work has framed the task as a classification problem and applied deep learning methods. The main practical challenges are the sparse label distribution and the severe class imbalance across categories. Moreover, most disorders have only a few labeled samples in the training set, which makes representation learning and generalization essential for obtaining a reliable feature descriptor. In this study, we used a facial recognition model trained on a large corpus of healthy individuals as a pre-training task and transferred it to facial phenotype recognition. We also created simple baselines of few-shot meta-learning methods to improve our base feature descriptor. Our quantitative results on the GestaltMatcher Database show that our CNN baseline surpasses previous work, including GestaltMatcher, and that few-shot meta-learning strategies improve retrieval performance on both frequent and rare classes.
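The abstract does not specify which few-shot meta-learning methods were used, so the following is only an illustrative sketch of one common option: prototype-based retrieval in the style of prototypical networks, applied on top of a fixed feature descriptor. The embeddings here are synthetic toy vectors standing in for CNN outputs, and the function names (`l2_normalize`, `class_prototypes`, `rank_classes`) are hypothetical, not taken from the paper or from GestaltMatcher.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale vectors to unit length so that dot products equal cosine similarity.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def class_prototypes(embeddings, labels):
    # One prototype per class: the mean of that class's support embeddings.
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, l2_normalize(protos)

def rank_classes(query, classes, protos):
    # Rank classes by cosine similarity between the query and each prototype.
    sims = protos @ l2_normalize(query)
    order = np.argsort(-sims)
    return classes[order], sims[order]

# Toy setup (assumption): three hypothetical disorders in an 8-D feature space,
# with an imbalanced support set of 20, 3, and 1 labeled samples.
rng = np.random.default_rng(0)
centers = np.eye(3, 8) * 5.0
counts = [20, 3, 1]
support = np.concatenate(
    [centers[c] + 0.1 * rng.normal(size=(n, 8)) for c, n in enumerate(counts)]
)
labels = np.concatenate([np.full(n, c) for c, n in enumerate(counts)])
support = l2_normalize(support)

classes, protos = class_prototypes(support, labels)

# A query drawn near the rarest class (a single support sample).
query = centers[2] + 0.1 * rng.normal(size=8)
ranked, scores = rank_classes(query, classes, protos)
print("ranked classes:", ranked)
```

Ranking against per-class prototypes rather than against individual gallery images means a class with a single labeled sample competes on the same footing as a class with many, which is one reason prototype-style methods are often considered for imbalanced retrieval settings like the one described above.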