Inferring spatial transcriptomics (ST) from histology enables scalable histogenomic profiling, yet current methods are largely restricted to single-tissue models. This fragmentation fails to leverage biological principles shared across cancer types and hinders application in data-scarce settings. While pan-cancer training offers a solution, the resulting heterogeneity challenges monolithic architectures. To bridge this gap, we introduce MoLF (Mixture-of-Latent-Flow), a generative model for pan-cancer histogenomic prediction. MoLF uses a conditional Flow Matching objective to map noise to the gene latent manifold, with the velocity field parameterized by a Mixture-of-Experts (MoE). By dynamically routing inputs to specialized sub-networks, this architecture decouples the optimization of diverse tissue patterns. Our experiments demonstrate that MoLF establishes a new state of the art, consistently outperforming both specialized and foundation model baselines on pan-cancer benchmarks. Furthermore, MoLF exhibits zero-shot generalization to cross-species data, suggesting that it captures fundamental, conserved histo-molecular mechanisms.
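To make the combination of a conditional Flow Matching objective and an MoE velocity field concrete, the following is a minimal NumPy sketch, not the MoLF implementation. Everything specific in it is an assumption for illustration: the class name `MoEVelocityField`, the linear experts, the dense softmax gate driven by a histology feature vector `h`, the straight-line interpolation path between noise and the gene latent, and the Euler sampler. The abstract does not state which path, gating scheme, or expert architecture MoLF actually uses.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class MoEVelocityField:
    """Toy velocity field v(z_t, t, h): a softmax gate mixes linear experts.

    Hypothetical design: the gate sees only the histology features h, and
    every expert is a single linear map (real experts would be deeper).
    """

    def __init__(self, latent_dim, cond_dim, n_experts, rng):
        in_dim = latent_dim + 1 + cond_dim  # [z_t, t, h]
        self.W_gate = rng.normal(0.0, 0.1, (cond_dim, n_experts))
        self.W_exp = rng.normal(0.0, 0.1, (n_experts, in_dim, latent_dim))

    def gate(self, h):
        # Routing weights per sample, shape (batch, n_experts), rows sum to 1.
        return softmax(h @ self.W_gate, axis=-1)

    def __call__(self, z_t, t, h):
        x = np.concatenate([z_t, t, h], axis=1)               # (B, in_dim)
        expert_out = np.einsum("bi,eio->beo", x, self.W_exp)  # (B, E, D)
        return np.einsum("be,beo->bo", self.gate(h), expert_out)


def cfm_loss(model, z1, h, rng):
    """Conditional Flow Matching loss on an assumed straight-line path."""
    z0 = rng.standard_normal(z1.shape)          # noise sample
    t = rng.uniform(size=(z1.shape[0], 1))      # random time in [0, 1)
    z_t = (1.0 - t) * z0 + t * z1               # point on the path
    target = z1 - z0                            # velocity of that path
    return float(np.mean((model(z_t, t, h) - target) ** 2))


def sample(model, h, latent_dim, rng, n_steps=20):
    """Integrate dz/dt = v(z, t, h) from noise (t=0) to t=1 with Euler steps."""
    z = rng.standard_normal((h.shape[0], latent_dim))
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((h.shape[0], 1), k * dt)
        z = z + dt * model(z, t, h)
    return z
```

In this sketch, training would minimize `cfm_loss` over pairs of gene latents `z1` and histology features `h`, and prediction would call `sample` with the features of a new image patch. A sparse top-k gate could replace the dense softmax so that each input activates only a few experts; which variant MoLF adopts is not specified in the abstract.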