Low-light image enhancement remains a persistent challenge in computer vision, where state-of-the-art models are often hampered by hardware constraints and computational inefficiency, particularly at high resolutions. While foundational architectures like transformers and diffusion models have advanced the field, their computational complexity limits their deployment on edge devices. We introduce ExpoMamba, a novel architecture that integrates a frequency-aware state-space model within a modified U-Net. ExpoMamba is designed to address mixed-exposure challenges by decoupling the modeling of amplitude (intensity) and phase (structure) in the frequency domain. This allows for targeted enhancement, making it highly effective for real-time applications, including downstream tasks like object detection and segmentation. Our experiments on six benchmark datasets show that ExpoMamba is up to 2-3x faster than competing models and achieves a 6.8\% PSNR improvement, establishing a new state-of-the-art in efficient, high-quality low-light enhancement. Source code: https://www.github.com/eashanadhikarla/ExpoMamba.
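The amplitude/phase decoupling described above can be illustrated with a plain 2-D FFT. This is only a sketch of the general idea: the actual model operates on learned feature maps inside the network, and the function names below are illustrative, not from the paper.

```python
import numpy as np

def decompose(img):
    """Split an image into its Fourier amplitude and phase."""
    spectrum = np.fft.fft2(img)
    return np.abs(spectrum), np.angle(spectrum)

def recompose(amplitude, phase):
    """Rebuild an image from (possibly modified) amplitude and phase."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum))

# Intensity lives largely in the amplitude: scaling it brightens the
# image, while the untouched phase preserves structural content.
rng = np.random.default_rng(0)
dark = rng.random((8, 8)) * 0.2           # dim synthetic "image"
amp, pha = decompose(dark)
brightened = recompose(amp * 3.0, pha)    # amplify intensity only

# Recomposition with unmodified components is numerically lossless.
assert np.allclose(recompose(amp, pha), dark)
```

Because the FFT is linear, uniformly scaling the amplitude with the phase fixed scales the reconstructed image by the same factor; a learned, frequency-dependent modification of the amplitude is what enables targeted enhancement.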