Learning-based MIMO detection has shown strong empirical performance, yet existing methods typically rely on fixed-depth architectures without explicitly modeling the progressive refinement of symbol estimates. In this paper, we revisit MIMO detection from a flow matching perspective and propose the Soft Graph Diffusion Transformer (SGDiT), which reformulates detection as a noise-level-conditioned denoising process that progressively transforms a Gaussian initialization toward the posterior conditioned on channel observations. An adaptive layer normalization (AdaLN)-conditioned soft graph transformer is employed to parameterize the denoising dynamics, enabling stage-aware information integration between observation and symbol domains. To better align with the discrete nature of symbol detection, we further adopt a cross-entropy-based training objective that directly models bit-wise posterior probabilities, providing a more suitable inductive bias than conventional regression-based formulations. Experimental results across various MIMO system configurations demonstrate that SGDiT achieves competitive bit error rate (BER) performance compared with representative baselines. Furthermore, the proposed model exhibits good generalization capability across different channel conditions. Overall, the SGDiT framework provides an effective and practical approach for neural MIMO detection.
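As a rough illustration of two ingredients named in the abstract, the sketch below shows AdaLN-style conditioning on a noise level and a bit-wise cross-entropy loss. It is a minimal sketch under stated assumptions, not the SGDiT implementation: the function names, the scalar noise level `t`, and the linear maps `w_scale` and `w_shift` (stand-ins for whatever network produces the modulation parameters) are all hypothetical.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def adaln(x, t, w_scale, w_shift):
    """AdaLN-style modulation: scale and shift normalized features based on noise level t.

    w_scale and w_shift are hypothetical per-feature weights; a real model would
    typically produce the scale and shift from an embedding of t via a small network.
    """
    normed = layer_norm(x)
    return [n * (1.0 + ws * t) + wb * t
            for n, ws, wb in zip(normed, w_scale, w_shift)]


def bitwise_cross_entropy(logits, bits):
    """Mean binary cross-entropy over bits, computed stably from raw logits.

    Each logit z models P(bit = 1) = sigmoid(z); bits are 0 or 1.
    """
    total = 0.0
    for z, b in zip(logits, bits):
        # Stable form of -[b*log(sigmoid(z)) + (1-b)*log(1-sigmoid(z))]
        total += max(z, 0.0) - z * b + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


if __name__ == "__main__":
    features = [0.5, -1.0, 2.0, 0.0]
    # The same features are modulated differently at different noise levels.
    print(adaln(features, t=0.1, w_scale=[1.0] * 4, w_shift=[0.5] * 4))
    print(adaln(features, t=0.9, w_scale=[1.0] * 4, w_shift=[0.5] * 4))
    # Confident, correct bit logits give a small loss; wrong ones a large loss.
    print(bitwise_cross_entropy([4.0, -4.0], [1, 0]))
    print(bitwise_cross_entropy([4.0, -4.0], [0, 1]))
```

The cross-entropy form treats each transmitted bit as a binary classification target, which is one way to read "directly models bit-wise posterior probabilities"; a regression-based alternative would instead penalize the squared distance between the estimate and the transmitted symbol.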