In recent years, artificial intelligence has significantly advanced medical image segmentation. Nonetheless, challenges remain, including efficient 3D medical image processing across diverse modalities and handling data variability. In this work, we introduce Hierarchical Soft Mixture-of-Experts (HoME), a two-level token-routing layer for efficient long-context modeling, specifically designed for 3D medical image segmentation. Built on the Mamba Selective State Space Model (SSM) backbone, HoME enhances sequential modeling through adaptive expert routing. In the first level, a Soft Mixture-of-Experts (SMoE) layer partitions input sequences into local groups, routing tokens to specialized per-group experts for localized feature extraction. The second level aggregates these outputs through a global SMoE layer, enabling cross-group information fusion and global context refinement. This hierarchical design, combining local expert routing with global expert refinement, improves generalizability and segmentation performance, surpassing state-of-the-art results on datasets spanning the three most widely used 3D medical imaging modalities and varying data quality. The code is publicly available at https://github.com/gmum/MambaHoME.
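To make the two-level routing concrete, below is a minimal NumPy sketch of the idea described above: tokens are first processed by a local Soft-MoE within fixed-size groups, and the group outputs are then fused by a global Soft-MoE. The linear experts, slot counts, group size, and residual connection here are illustrative assumptions for exposition, not the paper's actual architecture or configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe(tokens, phi, expert_weights):
    """Soft-MoE: tokens are softly dispatched to expert slots and recombined.
    tokens: (n, d); phi: (d, num_experts * slots_per_expert);
    expert_weights: (num_experts, d, d) -- one linear expert per slot group
    (a simplifying assumption; real experts would be MLPs)."""
    num_experts = expert_weights.shape[0]
    logits = tokens @ phi                      # (n, num_slots)
    dispatch = softmax(logits, axis=0)         # each slot = convex mix of tokens
    combine = softmax(logits, axis=1)          # each token = convex mix of slots
    slot_in = dispatch.T @ tokens              # (num_slots, d)
    spe = slot_in.shape[0] // num_experts      # slots per expert
    slot_out = np.vstack([
        slot_in[i * spe:(i + 1) * spe] @ expert_weights[i]  # expert i's slots
        for i in range(num_experts)
    ])
    return combine @ slot_out                  # (n, d)

def home_layer(tokens, group_size, local_params, global_params):
    """Two-level routing: local Soft-MoE per group, then a global Soft-MoE."""
    n, d = tokens.shape
    local_out = np.vstack([                    # level 1: per-group local experts
        soft_moe(tokens[s:s + group_size], *local_params)
        for s in range(0, n, group_size)
    ])
    # level 2: global experts fuse information across groups (residual assumed)
    return tokens + soft_moe(local_out, *global_params)

d, n, experts, slots = 16, 64, 4, 2
local_params = (rng.normal(size=(d, experts * slots)),
                rng.normal(size=(experts, d, d)) * 0.1)
global_params = (rng.normal(size=(d, experts * slots)),
                 rng.normal(size=(experts, d, d)) * 0.1)
x = rng.normal(size=(n, d))
y = home_layer(x, group_size=16,
               local_params=local_params, global_params=global_params)
print(y.shape)  # (64, 16) -- sequence length and width are preserved
```

Because dispatch and combine weights are dense softmaxes rather than hard top-k assignments, every token influences every slot, which keeps the layer fully differentiable; the grouping in level 1 is what restricts the first routing stage to local context.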