Deep learning applications in Magnetic Resonance Imaging (MRI) predominantly operate on reconstructed magnitude images, a process that discards phase information and requires computationally expensive transforms. Standard neural network architectures rely on local operations (convolutions or grid patches) that are ill-suited to the global, non-local nature of raw frequency-domain (k-space) data. In this work, we propose a novel complex-valued Vision Transformer (kViT) designed to perform classification directly on k-space data. To bridge the geometric disconnect between current architectures and MRI physics, we introduce a radial k-space patching strategy that respects the spectral energy distribution of the frequency domain. Extensive experiments on the fastMRI and in-house datasets demonstrate that our approach achieves classification performance competitive with state-of-the-art image-domain baselines (ResNet, EfficientNet, ViT). Crucially, kViT exhibits superior robustness to high acceleration factors and offers a paradigm shift in computational efficiency, reducing VRAM consumption during training by up to 68$\times$ compared to standard methods. This establishes a pathway for resource-efficient, direct-from-scanner AI analysis.