LMT: Longitudinal Mixing Training, a Framework to Predict Disease Progression from a Single Image

Rachid Zeghlache, Pierre-Henri Conze, Mostafa El Habib Daho, Yihao Li, Hugo Le boite, Ramin Tadayoni, Pascal Massin, Béatrice Cochener, Ikram Brahim, Gwenolé Quellec, Mathieu Lamard

Longitudinal imaging captures both static anatomical structures and the dynamic changes of disease progression, enabling earlier and better patient-specific pathology management. However, conventional approaches rarely take advantage of longitudinal information for detection and prediction, especially for Diabetic Retinopathy (DR). In recent years, Mix-up training and pretext tasks with longitudinal context have effectively improved DR classification results and captured disease progression. Meanwhile, a novel type of neural network, the Neural Ordinary Differential Equation (NODE), has been proposed for solving ordinary differential equations, with a neural network treated as a black box. By definition, NODE is well suited to time-related problems. In this paper, we propose to combine these three aspects to detect and predict DR progression. Our framework, Longitudinal Mixing Training (LMT), can be considered both a regularizer and a pretext task that encodes disease progression in the latent space. We also evaluate the model weights trained with standard and longitudinal pretext tasks on a downstream task with a longitudinal context. We introduce a new way to train time-aware models using $t_{mix}$, a weighted average time between two consecutive examinations. We compare our approach to standard mixing training on DR classification using OPHDIAT, a longitudinal retinal Color Fundus Photograph (CFP) dataset. We were able to predict whether an eye would develop severe DR by the following visit from a single image, with an AUC of 0.798 compared to a baseline of 0.641. Our results indicate that our longitudinal pretext task can learn the progression of DR and that introducing the $t_{mix}$ augmentation is beneficial for time-aware models.
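To make the idea of $t_{mix}$ concrete, the following is a minimal sketch of how two consecutive examinations might be mixed. The abstract states only that $t_{mix}$ is a weighted average time between two consecutive examinations; the function name `longitudinal_mix`, the Beta-distributed weight `lam` (borrowed from standard Mix-up), and the use of the same weight for both the images and the times are assumptions made for illustration, not the authors' confirmed formulation.

```python
import numpy as np


def longitudinal_mix(x_a, x_b, t_a, t_b, alpha=1.0, rng=None):
    """Mix two consecutive examinations of the same eye (illustrative sketch).

    x_a, x_b : image arrays of identical shape (earlier and later visit)
    t_a, t_b : acquisition times of the two visits (e.g. in years)
    alpha    : Beta distribution parameter, as in standard Mix-up (assumed)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Mixing weight in [0, 1]; the Beta prior is an assumption taken from Mix-up.
    lam = float(rng.beta(alpha, alpha))
    # Convex combination of the two images.
    x_mix = lam * x_a + (1.0 - lam) * x_b
    # Weighted average time between the two examinations (assumed to share lam).
    t_mix = lam * t_a + (1.0 - lam) * t_b
    return x_mix, t_mix, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visit_1 = rng.random((3, 8, 8))  # placeholder for an earlier fundus image
    visit_2 = rng.random((3, 8, 8))  # placeholder for the following visit
    x_mix, t_mix, lam = longitudinal_mix(visit_1, visit_2, 0.0, 1.5, rng=rng)
    print(x_mix.shape, t_mix, lam)
```

Under these assumptions, a time-aware model would receive `x_mix` together with `t_mix` rather than a single image with its original timestamp, so the interpolated time always lies between the two visit times.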
