RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation

Pathology report generation remains a relatively under-explored downstream task, primarily due to the gigapixel scale and complex morphological heterogeneity of Whole Slide Images (WSIs). Existing pathology report generation frameworks typically employ transformer architectures with a homogeneous decoder and static integration of retrieved knowledge. Such designs limit generative specialization and may introduce noisy external guidance during report generation. To address these limitations, we propose RANGER, a sparsely-gated Mixture-of-Experts (MoE) framework with adaptive retrieval re-ranking for pathology report generation. Specifically, we integrate a sparsely-gated MoE into the decoder, together with noisy top-$k$ routing and load-balancing regularization, to enable dynamic expert specialization across diverse diagnostic patterns. Additionally, we introduce an adaptive retrieval re-ranking module that selectively refines memory retrieved from a knowledge base before integration, reducing noise and improving semantic alignment based on visual feature representations. We perform extensive experiments on the PathText-BRCA dataset and demonstrate consistent improvements over existing approaches across standard natural language generation metrics. Our full RANGER model achieves the best performance on this dataset, with BLEU-1 to BLEU-4 scores of 0.4598, 0.3044, 0.2036, and 0.1435, respectively, a METEOR of 0.1883, and a ROUGE-L of 0.3038, validating the effectiveness of dynamic expert routing and adaptive knowledge refinement for semantically grounded pathology report generation.
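The abstract names noisy top-$k$ routing and load-balancing regularization without giving their equations. As a rough illustration only, the sketch below assumes the widely used formulation of Shazeer et al. (2017): Gaussian noise scaled by a softplus term is added to each expert's logit, only the $k$ largest logits are kept and renormalized with a softmax, and the balancing penalty is the squared coefficient of variation of per-expert importance. The function names, the plain-list inputs, and the choice of this particular penalty are assumptions for illustration, not details confirmed for RANGER.

```python
import math
import random


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def noisy_top_k_gate(clean_logits, noise_logits, k, rng=None):
    """Return sparse gate weights over experts for a single token.

    clean_logits and noise_logits are per-expert scores that a learned
    linear layer would normally produce (hypothetical inputs here).
    Pass rng=None to disable the noise, as one might at inference time.
    """
    noisy = []
    for clean, raw_noise in zip(clean_logits, noise_logits):
        eps = rng.gauss(0.0, 1.0) if rng is not None else 0.0
        noisy.append(clean + eps * softplus(raw_noise))
    # Keep the k largest noisy logits; all other experts get zero weight.
    top = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)[:k]
    peak = max(noisy[i] for i in top)
    exp = {i: math.exp(noisy[i] - peak) for i in top}
    total = sum(exp.values())
    return [exp[i] / total if i in exp else 0.0 for i in range(len(noisy))]


def load_balance_loss(batch_gates):
    """Squared coefficient of variation of per-expert importance.

    Importance of an expert is the sum of its gate weights over the batch
    (assumed penalty; the source does not specify the regularizer).
    """
    n_experts = len(batch_gates[0])
    importance = [sum(g[e] for g in batch_gates) for e in range(n_experts)]
    mean = sum(importance) / n_experts
    var = sum((v - mean) ** 2 for v in importance) / n_experts
    return var / (mean ** 2 + 1e-10)
```

For example, `noisy_top_k_gate([2.0, 1.0, 0.5, -1.0], [0.0] * 4, k=2)` gives nonzero weight only to the first two experts, and `load_balance_loss` is zero when every expert receives the same total gate weight and grows as routing collapses onto a few experts.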
