Laparoscopic surgery is minimally invasive and improves patient outcomes, but surgical smoke degrades visibility and compromises safety. Existing learning-based methods require large datasets and substantial computational resources. We propose the Progressive Frequency-Aware Network (PFAN), a lightweight GAN framework for laparoscopic image desmoking that combines the strengths of CNNs and Transformers to extract information progressively in the frequency domain. PFAN uses CNN-based Multi-scale Bottleneck-Inverting (MBI) Blocks to capture local high-frequency information and Locally-Enhanced Axial Attention Transformers (LAT) to handle global low-frequency information efficiently. PFAN desmokes laparoscopic images effectively even with limited training data. On the Cholec80 dataset, our method outperforms state-of-the-art approaches in PSNR, SSIM, CIEDE2000, and visual quality while using only 629K parameters. Our code and models are publicly available at https://github.com/jlzcode/PFAN.
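For readers unfamiliar with the first reported metric, PSNR can be sketched in a few lines. This is a minimal illustration of the standard definition, 10 * log10(MAX^2 / MSE), and not the evaluation code from the PFAN repository. The function name `psnr`, the flattened-pixel input format, and the default peak value of 255 (8-bit images) are assumptions made for this example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel intensities
    (an assumed input format for this sketch). Higher is better;
    identical images give infinity.
    """
    reference = list(reference)
    estimate = list(estimate)
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")

    # Standard definition: 10 * log10(MAX^2 / MSE).
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    smoke_free = [100, 100, 100, 100]   # hypothetical ground-truth pixels
    desmoked = [110, 110, 110, 110]     # hypothetical network output
    print(f"PSNR: {psnr(smoke_free, desmoked):.2f} dB")
```

In a desmoking evaluation, `reference` would be the smoke-free ground-truth frame and `estimate` the network output. SSIM and CIEDE2000 capture structural and perceptual colour differences that PSNR does not.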