Frequency Domain Decomposition Translation for Enhanced Medical Image Translation Using GANs

Medical image-to-image translation is a key task in computer vision and generative artificial intelligence, and it is highly applicable to medical image analysis. GAN-based methods are the mainstream approach to image translation, but they often ignore the variation and distribution of images in the frequency domain, or take only simple measures to align high-frequency information, which can lead to distorted, low-quality generated images. To address these problems, we propose a novel method called frequency domain decomposition translation (FDDT). FDDT decomposes the original image into a high-frequency component, which contains the details and identity information, and a low-frequency component, which contains the style information. Next, the high-frequency and low-frequency components of the transformed image are aligned, in the spatial domain, with the transformed results of the high-frequency and low-frequency components of the original image in the same frequency band, thus preserving the identity information of the image while destroying as little of its style information as possible. We conduct extensive experiments on MRI images and natural images with FDDT and several mainstream baseline models, and we use four evaluation metrics to assess the quality of the generated images. Compared with the baseline models, in the best case FDDT reduces Fréchet inception distance by up to 24.4% and mean squared error by up to 31%, and improves structural similarity by up to 4.4% and peak signal-to-noise ratio by up to 5.8%. Compared with the previous method, in the best case FDDT reduces Fréchet inception distance by up to 23.7% and mean squared error by up to 31.6%, and improves structural similarity by up to 1.8% and peak signal-to-noise ratio by up to 6.8%.
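The decomposition and band-wise alignment described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the filter or the loss, so the ideal circular low-pass mask, the `radius` cutoff, the L1 distance, and the function names `decompose` and `band_alignment_loss` are all assumptions made for illustration, and `generator` stands in for any trained image-to-image model.

```python
import numpy as np


def decompose(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Assumption: an ideal circular low-pass mask in Fourier space;
    the abstract does not say which filter FDDT actually uses.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    mask = dist <= radius  # keep only frequencies near DC
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    high = image - low  # the residual carries edges and fine detail
    return low, high


def band_alignment_loss(generator, image, radius):
    """Hypothetical L1 alignment between matching frequency bands.

    Compares each band of the translated image with the translation
    of the same band of the source image, in the spatial domain.
    """
    low, high = decompose(image, radius)
    out_low, out_high = decompose(generator(image), radius)
    low_term = np.abs(out_low - generator(low)).mean()
    high_term = np.abs(out_high - generator(high)).mean()
    return low_term + high_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    # np.tanh is only a placeholder for a trained generator.
    print(band_alignment_loss(np.tanh, img, radius=8))
```

Because the high-frequency part is defined as the residual, the two components always sum back to the input image. In a real training setup the same idea would be written with differentiable tensor operations and added to the usual GAN objective.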
