Omnidirectional image super-resolution (ODISR) aims to upscale low-resolution (LR) omnidirectional images (ODIs) to high-resolution (HR) ones, addressing the growing demand for detailed visual content across a $180^{\circ}\times360^{\circ}$ viewport. Existing methods rely on simple degradation assumptions (e.g., bicubic downsampling) that fail to capture the complex, unknown degradation processes found in the real world. Recent diffusion-based approaches suffer from slow inference because they require hundreds of sampling steps and frequent conversions between pixel and latent space. To tackle these challenges, we propose RealOSR, a novel diffusion-based approach for real-world ODISR (Real-ODISR) that performs diffusion denoising in a single step. To fully exploit the input information, RealOSR introduces a lightweight domain alignment module, which enables efficient injection of the LR ODI into the single-step latent denoising. Additionally, to better utilize the rich semantic and multi-scale feature modeling ability of the denoising UNet, we develop a latent unfolding module that simulates the gradient descent process directly in latent space. Experimental results demonstrate that RealOSR outperforms previous methods in both ODI recovery quality and efficiency. Compared to OmniSSR, the recent state-of-the-art diffusion-based ODISR method, RealOSR achieves significant improvements in visual quality and over \textbf{200$\times$} inference acceleration. Our code and models will be released.