Region-Graph Optimal Transport Routing for Mixture-of-Experts Whole-Slide Image Classification

Multiple Instance Learning (MIL) is the dominant framework for gigapixel whole-slide image (WSI) classification in computational pathology. However, current MIL aggregators route all instances through a shared pathway, which limits their capacity to specialise across the pathological heterogeneity within each slide. Mixture-of-Experts (MoE) methods offer a natural remedy by partitioning instances across specialised expert subnetworks; yet unconstrained softmax routing can yield highly imbalanced utilisation, in which one or a few experts absorb most of the routing mass and the mixture collapses to a near-single-pathway solution. To address these limitations, we propose ROAM (Region-graph OptimAl-transport Mixture-of-experts), a spatially aware MoE-MIL aggregator that routes region tokens to expert poolers via capacity-constrained entropic optimal transport, promoting balanced expert utilisation by construction. ROAM operates on spatial region tokens, obtained by compressing dense patch bags into spatially binned units that align routing with local tissue neighbourhoods, and introduces two key mechanisms: (i) region-to-expert assignment formulated as entropic optimal transport (Sinkhorn) with explicit per-slide capacity marginals, which enforces balanced expert utilisation without auxiliary load-balancing losses; and (ii) graph-regularised Sinkhorn iterations that diffuse routing assignments over the spatial region graph, encouraging neighbouring regions to route coherently to the same experts. Evaluated on four WSI benchmarks with frozen foundation-model patch embeddings, ROAM achieves performance competitive with strong MIL and MoE baselines, and on NSCLC generalisation (TCGA-CPTAC) reaches an external AUC of 0.845 ± 0.019.
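
To make the two routing mechanisms concrete, the following NumPy sketch shows one way capacity-constrained Sinkhorn routing with a graph diffusion step could look. It is an illustration under stated assumptions, not the ROAM implementation: the function name `graph_sinkhorn_route`, the parameters `eps`, `lam`, `n_iters` and `n_polish`, the equal mass per region, the row-normalised adjacency used for diffusion, and the final plain Sinkhorn iterations that restore the marginals are all hypothetical choices made here for clarity.

```python
import numpy as np


def graph_sinkhorn_route(scores, adjacency, capacity, eps=0.5, lam=0.2,
                         n_iters=50, n_polish=50):
    """Illustrative capacity-constrained routing of N regions to E experts.

    scores:    (N, E) region-to-expert affinities (higher means better match).
    adjacency: (N, N) non-negative spatial region graph.
    capacity:  (E,) target share of routing mass per expert, summing to 1.
    Returns a plan of shape (N, E) whose rows sum to 1/N and whose columns
    sum (approximately) to `capacity`.
    """
    scores = np.asarray(scores, dtype=float)
    capacity = np.asarray(capacity, dtype=float)
    n = scores.shape[0]
    row_mass = np.full(n, 1.0 / n)  # assumption: every region carries equal mass

    # Assumed diffusion operator: row-normalised adjacency with self-loops.
    smooth = np.asarray(adjacency, dtype=float) + np.eye(n)
    smooth /= smooth.sum(axis=1, keepdims=True)

    # Gibbs kernel of the entropic OT problem (cost = -scores); the per-row
    # shift only rescales rows and keeps the exponentials bounded by 1.
    plan = np.exp((scores - scores.max(axis=1, keepdims=True)) / eps)

    for it in range(n_iters + n_polish):
        if it < n_iters:
            # Assumed graph step: blend each region's assignment with the
            # average assignment of its spatial neighbours.
            plan = (1.0 - lam) * plan + lam * (smooth @ plan)
        # Sinkhorn scaling: match expert capacities, then region masses.
        plan = plan * (capacity / plan.sum(axis=0))[None, :]
        plan = plan * (row_mass / plan.sum(axis=1))[:, None]
    return plan


# Toy usage: 30 regions on a chain graph, 4 experts with unequal capacities.
rng = np.random.default_rng(0)
n_regions = 30
scores = rng.uniform(0.0, 1.0, size=(n_regions, 4))
chain = np.eye(n_regions, k=1) + np.eye(n_regions, k=-1)
capacity = np.array([0.4, 0.3, 0.2, 0.1])
plan = graph_sinkhorn_route(scores, chain, capacity)
print(plan.sum(axis=0))  # per-expert routing mass, to compare with `capacity`
```

In this sketch the capacity marginals are imposed by the column scaling itself, so no auxiliary load-balancing loss is involved, while `lam` controls how strongly neighbouring regions are pulled towards the same experts. A row-wise softmax over `scores` would place no such constraint on the column sums.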
