Multimodal and multicontrast image fusion via deep generative models

It has become increasingly evident that classic diagnostic labels cannot reliably describe the complexity and variability of several clinical phenotypes. This is particularly true for a broad range of neuropsychiatric illnesses (e.g., depression, anxiety disorders, behavioral phenotypes). Patient heterogeneity can be better described by grouping individuals into novel categories based on empirically derived sections of intersecting continua that span across and beyond traditional categorical borders. In this context, neuroimaging data carry a wealth of spatiotemporally resolved information about each patient's brain. However, they are usually heavily collapsed a priori through procedures that are not learned as part of model training and are consequently not optimized for the downstream prediction task. The reason is that each participant typically comes with multiple whole-brain 3D imaging modalities, often accompanied by deep genotypic and phenotypic characterization, which poses formidable computational challenges. In this paper we design a deep learning architecture based on generative models, rooted in a modular approach and separable convolutional blocks, to a) fuse multiple 3D neuroimaging modalities on a voxel-wise level, b) convert them into informative latent embeddings through heavy dimensionality reduction, and c) maintain good generalizability with minimal information loss. As a proof of concept, we test our architecture on the well-characterized Human Connectome Project database, demonstrating that our latent embeddings can be clustered into easily separable subject strata which, in turn, map to different phenotypical information that was not included in the embedding creation process. This may aid in predicting disease evolution as well as drug response, hence supporting mechanistic disease understanding and empowering clinical trials.
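
To give a rough sense of why separable convolutional blocks can ease the computational burden of whole-brain 3D inputs, the sketch below compares weight counts for a standard 3D convolution and a depthwise-separable one. It is an illustration under assumptions, not the architecture described here: the abstract does not state which kind of separable convolution is used, so the depthwise-separable form is assumed, as is treating each imaging modality as one input channel. The function names and the layer sizes (`c_in`, `c_out`, `k`) are hypothetical.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution (bias terms ignored)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise-separable 3D convolution (bias terms ignored).

    Depthwise step: one k*k*k filter per input channel.
    Pointwise step: a 1x1x1 convolution mixing c_in channels into c_out.
    """
    depthwise = c_in * k ** 3
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical sizes for illustration only: 4 modalities stacked as input
# channels (an assumption about voxel-wise fusion), 32 output channels,
# and a 3x3x3 kernel.
c_in, c_out, k = 4, 32, 3
standard = conv3d_params(c_in, c_out, k)
separable = separable_conv3d_params(c_in, c_out, k)
print(f"standard: {standard} weights, separable: {separable} weights")
```

For these illustrative sizes the separable layer needs far fewer weights than the standard one, and the gap widens as the channel count grows.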
