Connectional-Style-Guided Contextual Representation Learning for Brain Disease Diagnosis

Structural magnetic resonance imaging (sMRI) has shown great clinical value and is widely used in deep learning (DL) based computer-aided brain disease diagnosis. Previous approaches focused on local shapes and textures in sMRI that may be significant only within a particular domain. The learned representations are therefore likely to contain spurious information and to generalize poorly to other diseases and datasets. To capture meaningful and robust features, it is necessary to first understand the intrinsic pattern of the brain, which is not restricted to a single data or task domain. Because the brain is a complex connectome of interlinked neurons, its connectional properties have strong biological significance, are shared across multiple domains, and cover most pathological information. In this work, we propose a connectional style contextual representation learning model (CS-CRL) that captures the intrinsic pattern of the brain and is used to diagnose multiple brain diseases. Specifically, it has a vision transformer (ViT) encoder, uses mask reconstruction as the proxy task, and uses Gram matrices to guide the representation of connectional information. This design facilitates the capture of global context and the aggregation of features with biological plausibility. The results indicate that CS-CRL achieves superior accuracy in multiple brain disease diagnosis tasks across six datasets and three diseases and outperforms state-of-the-art models. Furthermore, we demonstrate that CS-CRL captures more brain-network-like properties, aggregates features better, is easier to optimize, and is more robust to noise, which offers a theoretical explanation for its superiority. Our source code will be released soon.
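
As a rough illustration of how a mask reconstruction objective might be combined with a Gram-matrix term, the following NumPy sketch may help. It is not the CS-CRL implementation, which has not been released. The function names, the token-by-token form of the Gram matrix, the mean squared error terms, and the `style_weight` parameter are all assumptions made for illustration, and the actual model may define each of them differently.

```python
import numpy as np


def gram_matrix(features):
    """Token-by-token Gram matrix of patch features.

    features: array of shape (num_patches, channels).
    Returns an array of shape (num_patches, num_patches) whose entry (i, j)
    is the inner product of patches i and j, scaled by the channel count.
    Assumption: the Gram matrix is taken over patches, not channels.
    """
    channels = features.shape[1]
    return features @ features.T / channels


def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error computed only on masked patches.

    pred, target: arrays of shape (num_patches, patch_dim).
    mask: boolean array of shape (num_patches,), True where a patch was masked.
    """
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()


def gram_guidance_loss(pred_features, target_features):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(pred_features) - gram_matrix(target_features)
    return (diff ** 2).mean()


def total_loss(pred, target, mask, style_weight=1.0):
    """Hypothetical combined objective: reconstruction plus Gram guidance.

    Assumption: the Gram term is computed on the same patch arrays as the
    reconstruction term; style_weight is an illustrative hyperparameter.
    """
    recon = masked_reconstruction_loss(pred, target, mask)
    style = gram_guidance_loss(pred, target)
    return recon + style_weight * style


# Toy usage with random data standing in for patch embeddings.
rng = np.random.default_rng(0)
target = rng.normal(size=(8, 16))                    # 8 patches, 16 values each
pred = target + 0.1 * rng.normal(size=target.shape)  # noisy "reconstruction"
mask = np.array([True, False, True, False, True, False, True, False])
loss = total_loss(pred, target, mask, style_weight=0.5)
```

In this sketch the Gram term compares pairwise relationships among patches rather than individual patch values, which is one plausible reading of "connectional" guidance; a channel-by-channel Gram matrix, as used in neural style transfer, would be an equally reasonable alternative.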
