Hematoxylin and Eosin (H&E) staining is a widely used sample preparation procedure for enhancing the saturation of tissue sections and the contrast between nuclei and cytoplasm in histology images for medical diagnostics. However, various factors, such as differences in the reagents used, result in high variability in the recorded stain colors. This variability makes it difficult for machine-learning-based computer-aided diagnostic tools to generalize. To desensitize the learned models to stain variations, we propose the Generative Stain Augmentation Network (G-SAN), a GAN-based framework that augments a collection of cell images with simulated yet realistic stain variations. At its core, G-SAN uses a novel and highly computationally efficient Laplacian Pyramid (LP) based generator architecture that is capable of disentangling stain from cell morphology. Through the tasks of patch classification and nucleus segmentation, we show that using G-SAN-augmented training data provides, on average, a 15.7% improvement in F1 score and a 7.3% improvement in panoptic quality, respectively. Our code is available at https://github.com/lifangda01/GSAN-Demo.
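For readers unfamiliar with the Laplacian Pyramid mentioned above, the following is a minimal sketch of the generic decomposition and its exact reconstruction. It is not the G-SAN generator: the function names are hypothetical, and 2x2 average pooling with nearest-neighbour upsampling is assumed here in place of the usual Gaussian filtering to keep the example short. The image height and width are assumed to be divisible by `2 ** levels`.

```python
import numpy as np


def downsample(img):
    """Halve height and width by averaging each 2x2 block.

    Assumption: a simple stand-in for Gaussian blur followed by decimation.
    """
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [band_0, ..., band_{levels-1}, coarsest_image].

    Each band holds the detail lost when moving to the next coarser level.
    """
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # high-frequency residual
        current = smaller
    pyramid.append(current)  # low-resolution remainder
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding back each band."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = upsample(current) + band
    return current


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.random((64, 64, 3))  # synthetic stand-in for an RGB patch
    pyr = build_laplacian_pyramid(patch, levels=3)
    print([level.shape for level in pyr])
    print(np.allclose(reconstruct(pyr), patch))
```

With this particular pooling choice, every detail band averages to zero over each 2x2 block, so the mean colour of the patch is carried entirely by the coarsest level, while edges and fine texture sit in the bands. That separation is one plausible intuition for why a pyramid representation suits stain manipulation, though the abstract itself does not spell out how G-SAN uses each level.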