We present a novel diffusion-based approach for generating synthetic histopathological Whole Slide Images (WSIs) at an unprecedented gigapixel scale. Synthetic WSIs have many potential applications: they can augment training datasets to improve the performance of computational pathology applications, they allow the creation of synthetic copies of datasets that can be shared without violating privacy regulations, and they can facilitate learning representations of WSIs without requiring data annotations. Despite this variety of applications, no existing deep-learning-based method generates WSIs at their typically high resolutions, mainly because of the high computational complexity. We therefore propose a novel coarse-to-fine sampling scheme for generating high-resolution WSIs. In this scheme, we progressively increase the resolution of an initial low-resolution image until it reaches that of a high-resolution WSI. Specifically, a diffusion model sequentially adds fine details to the image and increases its resolution. In our experiments, we train our method on WSIs from the TCGA-BRCA dataset. In addition to quantitative evaluations, we also perform a user study with pathologists. The study results suggest that our generated WSIs resemble the structure of real WSIs.
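The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the upsampling factor, the number of stages, or how the diffusion model is applied, so everything here is an assumption: a factor of 2 per stage, patch-wise processing, and a hypothetical `refine_patch` function that merely perturbs pixels as a stand-in for a trained diffusion model. It shows only the control flow of repeatedly upsampling and refining, not the authors' method.

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour 2x upsampling of an (H, W, C) array."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def refine_patch(patch, rng, noise_scale):
    """Hypothetical stand-in for a diffusion model that adds fine detail.

    A real system would run a learned denoising process here; this
    placeholder only adds small Gaussian perturbations and clips to [0, 1].
    """
    noisy = patch + noise_scale * rng.standard_normal(patch.shape)
    return np.clip(noisy, 0.0, 1.0)

def coarse_to_fine(initial, num_stages, patch_size=64, seed=0):
    """Grow a low-resolution image by 2x per stage, refining patch by patch.

    Processing fixed-size patches keeps the per-step memory cost bounded
    even as the full image grows (an assumed design choice).
    """
    rng = np.random.default_rng(seed)
    img = initial.astype(np.float64)
    for stage in range(num_stages):
        img = upsample2x(img)
        height, width, _ = img.shape
        for y in range(0, height, patch_size):
            for x in range(0, width, patch_size):
                patch = img[y:y + patch_size, x:x + patch_size]
                img[y:y + patch_size, x:x + patch_size] = refine_patch(
                    patch, rng, noise_scale=0.1 / (stage + 1)
                )
    return img

# Toy example: a 16x16 grey image grown over three stages to 128x128.
low_res = np.full((16, 16, 3), 0.5)
wsi = coarse_to_fine(low_res, num_stages=3)
print(wsi.shape)
```

Because each stage doubles the side length, reaching gigapixel scale from a small thumbnail would take only on the order of ten such stages, which is why the per-stage cost, rather than the stage count, dominates in a scheme of this shape.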