The development of generative models over the past decade has enabled hyperrealistic data synthesis. Although potentially beneficial, synthetic data generation remains relatively underexplored in cancer histopathology. One approach to synthesising realistic images is diffusion, which iteratively converts an image to noise and learns to reverse this process [Wang and Vastola, 2023]. While effective, diffusion is computationally expensive for high-resolution images, which makes applying it directly to histopathology images infeasible. Variational Autoencoders (VAEs) make it possible to learn representations of complex high-resolution images in a latent space. A vital by-product is the ability to compress high-resolution images into this latent space and recover them near-losslessly. The marriage of diffusion and VAEs allows us to carry out diffusion in the latent space of an autoencoder, leveraging the realistic generative capabilities of diffusion while keeping computational requirements reasonable. Rombach et al. [2021b] and Yellapragada et al. [2023] build foundational models for this task, paving the way to generating realistic histopathology images. In this paper, we discuss the pitfalls of current methods, namely that of Yellapragada et al. [2023], and resolve critical errors while proposing improvements along the way. Our method achieves an FID of 21.11, improving on the state-of-the-art result of Yellapragada et al. [2023] by 1.2 FID, while reducing training-time GPU memory usage by 7%.
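To make the "image to noise" step concrete, the following is a minimal sketch of the closed-form forward noising process commonly used in diffusion models, applied to a latent vector rather than to pixels. It is an illustration under stated assumptions, not the configuration used in this paper: the linear noise schedule, the 1000 steps, the function names, and the 16-dimensional random vector standing in for a VAE-encoded latent are all hypothetical.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps.

    Returns the noised latent and the Gaussian noise eps that was added;
    a denoising network would be trained to predict eps from (z_t, t).
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
# Hypothetical stand-in for the latent a VAE encoder would produce from an image.
z0 = [rng.gauss(0.0, 1.0) for _ in range(16)]
zt, eps = noise_latent(z0, 500, alpha_bars, rng)
```

Because the noising operates on the compressed latent instead of the full-resolution image, each step touches far fewer values, which is the source of the computational savings described above.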