Multimodal and multicontrast image fusion via deep generative models

It has become increasingly evident that classic diagnostic labels cannot reliably describe the complexity and variability of several clinical phenotypes. This is particularly true for a broad range of neuropsychiatric illnesses (e.g., depression, anxiety disorders, behavioral phenotypes). Patient heterogeneity can be better described by grouping individuals into novel categories based on empirically derived sections of intersecting continua that span and extend beyond traditional categorical borders. In this context, neuroimaging data carry a wealth of spatiotemporally resolved information about each patient's brain. However, these data are usually heavily reduced a priori by procedures that are not learned as part of model training and are consequently not optimized for the downstream prediction task. This is because each participant typically comes with multiple whole-brain 3D imaging modalities, often accompanied by deep genotypic and phenotypic characterization, which poses formidable computational challenges. In this paper, we design a deep learning architecture based on generative models, built on a modular approach and separable convolutional blocks, to a) fuse multiple 3D neuroimaging modalities at the voxel level, b) convert them into informative latent embeddings through heavy dimensionality reduction, and c) maintain good generalizability and minimal information loss. As a proof of concept, we test our architecture on the well-characterized Human Connectome Project database and demonstrate that our latent embeddings can be clustered into easily separable subject strata which, in turn, map to different phenotypical information that was not included in the embedding creation process. This may aid in predicting disease evolution and drug response, thereby supporting mechanistic disease understanding and empowering clinical trials.
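To illustrate why separable convolutional blocks can ease the computational burden of voxel-wise fusion, the following minimal sketch compares weight counts for a standard 3D convolution and a depthwise separable one, and shows a per-voxel weighted combination of modalities. It is a sketch under stated assumptions, not the architecture of this paper: the abstract does not specify the form of the separable blocks, so the depthwise-plus-pointwise factorization is assumed here, and the modality count, channel width, kernel size, and function names are all hypothetical.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k*k*k kernel (bias omitted)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise separable 3D convolution (bias omitted).

    Assumed factorization: one k*k*k spatial filter per input channel,
    followed by a 1x1x1 pointwise convolution that mixes channels.
    """
    return c_in * k ** 3 + c_in * c_out


def pointwise_fuse(volumes, weights):
    """Fuse co-registered volumes voxel by voxel with a weighted sum.

    volumes: list of M nested lists, each indexed as [z][y][x].
    weights: list of M floats (what a single 1x1x1 filter would learn).
    """
    depth = len(volumes[0])
    height = len(volumes[0][0])
    width = len(volumes[0][0][0])
    return [
        [
            [
                sum(w * v[z][y][x] for v, w in zip(volumes, weights))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for z in range(depth)
    ]


# Hypothetical setting: 4 modalities stacked as input channels,
# 32 output channels, 3x3x3 kernels.
standard = conv3d_params(4, 32, 3)
separable = separable_conv3d_params(4, 32, 3)
print(standard, separable)  # 3456 236

# Two toy 1x1x2 "volumes" fused with equal weights.
fused = pointwise_fuse([[[[1.0, 2.0]]], [[[3.0, 4.0]]]], [0.5, 0.5])
print(fused)  # [[[2.0, 3.0]]]
```

Under these assumed sizes the separable block uses roughly fifteen times fewer weights than the standard one. The saving grows with the number of channels, which matters when several whole-brain volumes are processed jointly.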
