SegTME-UNI2: A Foundation Model-Based Framework for Generalisable Multiclass Cell Segmentation and LLM-Driven Tumour Microenvironment Characterisation in Histopathology

Wan Siti Halimatul Munirah Wan Ahmad, Faris Syahmi Samidi, Mohammad Badal Ahmmed, Vimal Angela Thiviyanathan, Selvam James Thavaraj, Anwar P. P. Abdul Majeed

Characterising the tumour microenvironment (TME) from routine H&E-stained histology images requires simultaneous cell segmentation, feature extraction, and interpretable clinical reporting. We present SegTME-UNI2, a unified framework that addresses these requirements. Its core is UNI2-UPERHOVER, a dual-head segmentation model that pairs the UNI2-H pathology foundation model (ViT-Giant, pretrained on >100M tiles from 100K slides) with two parallel UperNet decoders: one for six-class semantic segmentation and one for horizontal-vertical gradient regression, which enables watershed-based separation of nuclear instances. Because large real-world repositories lack pixel-level annotations, UNI2-UPERHOVER is trained with a three-stage progressive pseudo-label curriculum. Each stage trains a fresh model without weight transfer, so any improvement comes entirely from higher pseudo-label quality. Stage 1 uses human-annotated PanNuke (7,901 images, 189,744 nuclei, 0.25 µm/pixel). Stage 2 uses entropy-filtered pseudo-labels generated by the Stage 1 model on 271,711 TCGA-UT scale-0 patches (0.5 µm/pixel). Stage 3 uses pseudo-labels generated by the Stage 2 model on all 1,608,060 TCGA-UT patches across six resolution scales (0.5–1.0 µm/pixel). The segmentation outputs feed a structured TME feature extraction pipeline that computes 20+ per-patch metrics covering composition, morphology, spatial entropy, and intercellular distance. These metrics are encoded as JSON and passed to a fine-tuned NVIDIA BioNeMo GPT model, which generates clinically interpretable TME narratives. Preliminary validation on held-out PanNuke and TCGA-UT partitions demonstrates the framework's feasibility and internal consistency. The pseudo-labelled TCGA-UT dataset and the UNI2-UPERHOVER checkpoint are publicly released to support large-scale TME profiling and spatial biology research.
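The feature extraction step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the JSON keys, and the input format (one dictionary per segmented nucleus with a class label, a centroid in pixels, and an area in pixels) are all hypothetical. The class list assumes PanNuke's five nucleus types, with the sixth semantic class taken to be background, and the entropy shown is a plain Shannon entropy of the class composition, which may differ from the spatial entropy the abstract refers to.

```python
import json
import math
from collections import Counter

# Assumption: PanNuke's five nucleus types; the sixth semantic class is
# taken to be background and therefore never labels a nucleus.
CLASSES = ["neoplastic", "inflammatory", "connective", "dead", "epithelial"]


def composition(nuclei):
    """Return the fraction of nuclei belonging to each class."""
    counts = Counter(n["cls"] for n in nuclei)
    total = len(nuclei)
    return {c: (counts[c] / total if total else 0.0) for c in CLASSES}


def shannon_entropy(fractions):
    """Shannon entropy (in bits) of a class-fraction dictionary."""
    return -sum(p * math.log2(p) for p in fractions.values() if p > 0)


def mean_nearest_distance(nuclei, src, dst, um_per_px):
    """Mean distance (µm) from each src nucleus to its nearest dst nucleus."""
    a = [n["xy"] for n in nuclei if n["cls"] == src]
    b = [n["xy"] for n in nuclei if n["cls"] == dst]
    if not a or not b:
        return None  # undefined when either class is absent from the patch
    dists = [min(math.dist(p, q) for q in b) for p in a]
    return um_per_px * sum(dists) / len(dists)


def tme_features(nuclei, um_per_px=0.5):
    """Collect a few illustrative per-patch metrics into one dictionary."""
    fracs = composition(nuclei)
    areas = [n["area_px"] * um_per_px ** 2 for n in nuclei]
    return {
        "n_nuclei": len(nuclei),
        "composition": fracs,
        "composition_entropy_bits": shannon_entropy(fracs),
        "mean_nucleus_area_um2": sum(areas) / len(areas) if areas else None,
        "mean_dist_inflammatory_to_neoplastic_um": mean_nearest_distance(
            nuclei, "inflammatory", "neoplastic", um_per_px
        ),
    }


# Toy patch: four nuclei with made-up centroids (pixels) and areas (pixels).
nuclei = [
    {"cls": "neoplastic", "xy": (0.0, 0.0), "area_px": 100},
    {"cls": "neoplastic", "xy": (10.0, 0.0), "area_px": 120},
    {"cls": "inflammatory", "xy": (0.0, 6.0), "area_px": 40},
    {"cls": "inflammatory", "xy": (10.0, 8.0), "area_px": 60},
]

features = tme_features(nuclei, um_per_px=0.5)
payload = json.dumps(features, indent=2)
print(payload)
```

In a setup like this, the `payload` string would be what is handed to the language model. The abstract does not specify the prompt format or the exact metric definitions, so both are left open here.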