Longitudinal datasets offer a promising basis for predicting glaucoma progression and thereby supporting early therapeutic intervention. Most existing methods predict glaucoma stage labels directly from longitudinal data, an approach that may not capture the gradual developmental trajectory of the disease. To support clinicians' diagnostic judgment, we propose a novel diffusion-based model that predicts future fundus images by extrapolating from a patient's existing longitudinal fundus images. The model takes a sequence of images as input, and a time-aligned mask selects the specific year for which an image is generated. During training, the time-aligned mask also resolves the problem of irregular time intervals in the sampling of longitudinal image sequences. In addition, we randomly mask one frame in each sequence and use it as the ground truth, which helps the network keep learning the internal relationships among the frames of a sequence throughout training. Text labels are further introduced to categorize the images generated within the sequence. Experimental results indicate that the proposed model not only generates longitudinal data effectively but also significantly improves the precision of downstream classification tasks.
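The time-aligned mask and the random frame masking described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `align_to_year_grid` and `mask_random_frame`, the one-slot-per-year grid, and the use of `None` for unobserved years are all assumptions made for illustration, and images are treated as opaque objects (in practice they would be arrays or tensors).

```python
import random


def align_to_year_grid(images, visit_years, start_year, num_slots):
    """Place irregularly spaced visits onto a fixed yearly grid.

    Hypothetical helper: slot i holds the image taken in year
    start_year + i, or None if the patient had no visit that year.
    """
    grid = [None] * num_slots
    for image, year in zip(images, visit_years):
        slot = year - start_year
        if 0 <= slot < num_slots:
            grid[slot] = image
    return grid


def mask_random_frame(grid, rng):
    """Hide one observed frame and return it as the training target.

    Hypothetical helper. Returns:
      context   - the grid with the chosen frame replaced by None
      time_mask - booleans, True only at the slot to be generated
      target    - the hidden frame, used as ground truth
    """
    observed = [i for i, frame in enumerate(grid) if frame is not None]
    if not observed:
        raise ValueError("sequence has no observed frames to mask")
    target_slot = rng.choice(observed)
    target = grid[target_slot]
    context = list(grid)
    context[target_slot] = None
    time_mask = [i == target_slot for i in range(len(grid))]
    return context, time_mask, target


# Example: three visits spread unevenly over a five-year window.
example_grid = align_to_year_grid(
    images=["img_2015", "img_2016", "img_2019"],
    visit_years=[2015, 2016, 2019],
    start_year=2015,
    num_slots=5,
)
example_context, example_mask, example_target = mask_random_frame(
    example_grid, random.Random(0)
)
```

Under these assumptions, every sequence has the same length regardless of how irregularly the visits were spaced, and the boolean mask tells a generator which year to produce. How the actual model encodes the mask and conditions the diffusion process is not specified in the text above.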