Cell microscopy data are abundant; however, corresponding segmentation annotations remain scarce. Moreover, variations in cell types, imaging devices, and staining techniques introduce significant domain gaps between datasets. As a result, even large segmentation models pretrained on diverse datasets (source datasets) struggle to generalize to unseen datasets (target datasets). To overcome this generalization problem, we propose CellStyle, which improves the segmentation quality of such models on a target dataset without requiring target labels, thereby enabling zero-shot adaptation. CellStyle transfers attributes of an unannotated target dataset, such as texture, color, and noise, to the annotated source dataset. The transfer preserves the cell shapes of the source images, so the existing source annotations remain valid while the visual characteristics of the target dataset are adopted. The styled synthetic images, paired with the existing annotations, enable finetuning a generalist segmentation model for application to the unannotated target data. We demonstrate that CellStyle significantly improves zero-shot cell segmentation performance across diverse datasets by finetuning multiple segmentation models on the style-transferred data. The code will be made publicly available.
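The core idea, that target-dataset appearance can be imposed on source images while their spatial structure (and hence their masks) is preserved, can be illustrated with a much simpler stand-in than the method described above. The sketch below uses per-channel mean/std matching in NumPy; this is only a crude illustration of the structure-preserving style transfer concept, not the CellStyle method itself, and all function names here are hypothetical.

```python
import numpy as np


def match_statistics(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Restyle `source` toward `target` by per-channel mean/std matching.

    The operation is pixel-wise and monotonic per channel, so the spatial
    layout of the source image (i.e., the cell shapes) is untouched; only
    the global intensity/color statistics change. This is an illustrative
    stand-in for CellStyle's style transfer, not the actual method.
    """
    src = source.astype(np.float64)
    tgt = target.astype(np.float64)
    out = np.empty_like(src)
    for c in range(src.shape[-1]):
        s_mu, s_sigma = src[..., c].mean(), src[..., c].std() + 1e-8
        t_mu, t_sigma = tgt[..., c].mean(), tgt[..., c].std()
        # Standardize the source channel, then rescale to target statistics.
        out[..., c] = (src[..., c] - s_mu) / s_sigma * t_sigma + t_mu
    return np.clip(out, 0, 255).astype(np.uint8)


# Toy example: a bright "source" image restyled toward a dark "target".
rng = np.random.default_rng(0)
source_img = rng.integers(150, 250, size=(64, 64, 3)).astype(np.uint8)
target_img = rng.integers(10, 90, size=(64, 64, 3)).astype(np.uint8)

styled = match_statistics(source_img, target_img)
# `styled` now follows the target's intensity statistics while keeping the
# source geometry, so existing source masks could still supervise finetuning.
```

Because the mapping never moves pixels, any segmentation mask annotated on `source_img` still aligns with `styled`, which is the property that lets the source labels be reused for finetuning on target-styled data.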