This paper presents a comprehensive study and benchmark on Efficient Perceptual Super-Resolution (EPSR). While significant progress has been made in efficient PSNR-oriented super-resolution, approaches that target perceptual quality metrics remain relatively inefficient. Motivated by this gap, we aim to match or improve upon the perceptual results of Real-ESRGAN while meeting strict efficiency constraints: at most 5M parameters and 2000 GFLOPs, computed for an input size of 960x540 pixels. The proposed solutions were evaluated on a novel dataset of 500 test images at 4K resolution, each degraded with multiple degradation types and provided without its original high-quality counterpart. This design aims to reflect realistic deployment conditions and serves as a diverse and challenging benchmark. The top-performing approach outperforms Real-ESRGAN across all benchmark datasets, demonstrating the potential of efficient methods in the perceptual domain. This paper establishes modern baselines for efficient perceptual super-resolution.
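To make the efficiency constraints concrete, the following minimal sketch estimates the parameter count and GFLOPs of a purely convolutional network at the 960x540 reference input and checks them against the 5M / 2000 GFLOPs budget. Only the three budget numbers come from the abstract. The `Conv2d` helper, the toy architecture, the x4 pixel-shuffle output layer, and the convention of counting one multiply-accumulate as two FLOPs are all illustrative assumptions; the abstract does not state how FLOPs are counted, and some tools report MACs as FLOPs.

```python
from dataclasses import dataclass

# Budget values taken from the abstract.
MAX_PARAMS = 5_000_000
MAX_GFLOPS = 2000.0
INPUT_W, INPUT_H = 960, 540


@dataclass
class Conv2d:
    """Hypothetical stride-1, same-padding convolution descriptor."""
    in_ch: int
    out_ch: int
    kernel: int
    bias: bool = True

    def params(self) -> int:
        weights = self.in_ch * self.out_ch * self.kernel * self.kernel
        return weights + (self.out_ch if self.bias else 0)

    def macs(self, width: int, height: int) -> int:
        # One MAC per weight per output pixel; bias additions are ignored.
        return width * height * self.in_ch * self.out_ch * self.kernel ** 2


def budget_report(layers, width=INPUT_W, height=INPUT_H, flops_per_mac=2):
    """Sum parameters and FLOPs over the layers and compare to the budget.

    flops_per_mac=2 is an assumed convention, not one stated in the abstract.
    """
    total_params = sum(layer.params() for layer in layers)
    total_macs = sum(layer.macs(width, height) for layer in layers)
    gflops = total_macs * flops_per_mac / 1e9
    return {
        "params": total_params,
        "gflops": gflops,
        "within_budget": total_params <= MAX_PARAMS and gflops <= MAX_GFLOPS,
    }


# Toy network (assumption): head conv, 16 body convs, and a tail conv that
# emits 3 * 4 * 4 channels for a parameter-free x4 pixel shuffle.
toy_layers = (
    [Conv2d(3, 64, 3)]
    + [Conv2d(64, 64, 3) for _ in range(16)]
    + [Conv2d(64, 3 * 4 * 4, 3)]
)

report = budget_report(toy_layers)
print(report)
```

Because every layer in this toy network runs at the low-resolution input size, the cost scales linearly with the 960x540 pixel count; layers that operate after upsampling would need their own, larger spatial size in `macs`.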