Scarcity of annotated data, particularly for rare or atypical morphologies, presents a significant challenge for cell and nuclei segmentation in computational pathology. While manual annotation is labor-intensive and costly, synthetic data offers a cost-effective alternative. We introduce a Multimodal Semantic Diffusion Model (MSDM) that generates realistic, pixel-precise image-mask pairs for cell and nuclei segmentation. By conditioning the generative process on cellular/nuclear morphologies (encoded as horizontal and vertical maps), RGB color characteristics, and BERT-encoded assay/indication metadata, MSDM generates datasets with desired morphological properties. These heterogeneous modalities are integrated via multi-head cross-attention, enabling fine-grained control over the generated images. Quantitative analysis demonstrates that synthetic images closely match real data, with low Wasserstein distances between embeddings of generated and real images under matching biological conditions. Incorporating these synthetic samples, exemplified here by columnar cells, significantly improves segmentation model accuracy on columnar cells. This strategy systematically enriches datasets and directly targets model deficiencies. We highlight the effectiveness of multimodal diffusion-based augmentation for advancing the robustness and generalizability of cell and nuclei segmentation models, paving the way for broader application of generative models in computational pathology.
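
To make the fusion step concrete, the following is a minimal NumPy sketch of multi-head cross-attention in which image feature tokens act as queries and the concatenated conditioning tokens (morphology maps, color, text metadata) act as keys and values. It is an illustration under stated assumptions, not the MSDM implementation: the token counts, the embedding width `d_model`, the random projection weights, and the names `multi_head_cross_attention`, `morph_tokens`, `color_tokens`, and `text_tokens` are all hypothetical, and random arrays stand in for the learned encoders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(queries, context, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product cross-attention with several heads.

    queries: (n_q, d_model) image feature tokens
    context: (n_c, d_model) conditioning tokens from all modalities
    Returns the fused tokens (n_q, d_model) and the attention
    weights (num_heads, n_q, n_c).
    """
    n_q, d_model = queries.shape
    n_c = context.shape[0]
    d_head = d_model // num_heads

    # Project, then split the feature axis into heads: (heads, tokens, d_head).
    q = (queries @ w_q).reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = (context @ w_k).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)
    v = (context @ w_v).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)

    # Each image token attends over every conditioning token.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)

    # Weighted sum of values, then merge heads and apply the output projection.
    heads = weights @ v
    merged = heads.transpose(1, 0, 2).reshape(n_q, d_model)
    return merged @ w_o, weights

rng = np.random.default_rng(0)
d_model, num_heads = 32, 4

# Hypothetical stand-ins for encoder outputs (assumed shapes, random values).
image_tokens = rng.normal(size=(64, d_model))   # flattened 8x8 feature map
morph_tokens = rng.normal(size=(16, d_model))   # horizontal/vertical map features
color_tokens = rng.normal(size=(3, d_model))    # RGB color characteristics
text_tokens = rng.normal(size=(8, d_model))     # text-encoder metadata tokens

# Concatenate all modalities into a single conditioning sequence.
context = np.concatenate([morph_tokens, color_tokens, text_tokens], axis=0)

w_q, w_k, w_v, w_o = (
    rng.normal(scale=d_model ** -0.5, size=(d_model, d_model)) for _ in range(4)
)

fused, attn = multi_head_cross_attention(
    image_tokens, context, w_q, w_k, w_v, w_o, num_heads
)
print(fused.shape, attn.shape)
```

Concatenating the modalities into one key/value sequence is only one possible design; it lets each spatial location weight morphology, color, and metadata tokens independently, which is one way such fine-grained control could arise.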