Low-light image enhancement remains a challenging task in computer vision, with existing state-of-the-art models often limited by hardware constraints and computational inefficiency, particularly on high-resolution images. Despite their efficacy across many domains, recent foundation models such as transformers and diffusion models see limited use on edge devices because of their computational complexity and slow inference times. We introduce ExpoMamba, a novel architecture that integrates frequency state-space components within a modified U-Net, offering a blend of efficiency and effectiveness. The model is specifically optimized to handle mixed-exposure challenges, a common issue in low-light image enhancement, while remaining computationally efficient. Our experiments demonstrate that ExpoMamba enhances low-light images 2-3x faster than traditional models, with an inference time of 36.6 ms, and achieves a PSNR improvement of approximately 15-20% over competing models, making it well suited for real-time image processing applications.
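To make the architectural idea concrete, the following is a minimal, hypothetical PyTorch sketch of how a frequency-domain block and a simplified state-space recurrence might sit at the bottleneck of a small U-Net. All names (FrequencyBlock, SimpleSSM, TinyUNet), layer sizes, and the diagonal scan are illustrative assumptions, not the authors' implementation; in particular, the sequential scan stands in for a far more efficient Mamba-style selective scan.

```python
# Hypothetical sketch of a frequency + SSM bottleneck in a small U-Net.
# Assumptions throughout: module names, channel counts, and the diagonal
# recurrence are illustrative, not the ExpoMamba implementation.
import torch
import torch.nn as nn


class FrequencyBlock(nn.Module):
    """Applies a learnable per-channel complex gain in the 2D FFT domain."""

    def __init__(self, channels: int):
        super().__init__()
        # Complex gain stored as two real parameters (real, imaginary).
        self.weight = nn.Parameter(torch.randn(channels, 1, 1, 2) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        freq = torch.fft.rfft2(x, norm="ortho")        # (B, C, H, W//2 + 1), complex
        gain = torch.view_as_complex(self.weight)       # (C, 1, 1), complex
        return torch.fft.irfft2(freq * gain, s=x.shape[-2:], norm="ortho")


class SimpleSSM(nn.Module):
    """Diagonal linear state-space recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t,
    scanned along the width axis. A drastic simplification of a selective scan."""

    def __init__(self, channels: int):
        super().__init__()
        self.log_a = nn.Parameter(torch.zeros(channels))  # decay a = sigmoid(log_a)
        self.b = nn.Parameter(torch.ones(channels))
        self.c = nn.Parameter(torch.ones(channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, C, H, W = x.shape
        a = torch.sigmoid(self.log_a).view(1, C, 1)
        b = self.b.view(1, C, 1)
        c = self.c.view(1, C, 1)
        h = x.new_zeros(B, C, H)
        ys = []
        for t in range(W):                               # sequential for clarity, not speed
            h = a * h + b * x[..., t]
            ys.append(c * h)
        return torch.stack(ys, dim=-1)                   # (B, C, H, W)


class TinyUNet(nn.Module):
    """One-level encoder/decoder with frequency and SSM paths fused at the bottleneck."""

    def __init__(self, ch: int = 16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.down = nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1)
        self.freq = FrequencyBlock(ch * 2)
        self.ssm = SimpleSSM(ch * 2)
        self.up = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(ch * 2, ch, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skip = self.enc(x)
        z = self.down(skip)
        z = z + self.freq(z) + self.ssm(z)               # residual fusion of both paths
        u = self.up(z)
        u = self.dec(torch.cat([u, skip], dim=1))        # U-Net skip connection
        return torch.sigmoid(self.out(u))                # enhanced image in [0, 1]


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)                       # dummy low-light input
    print(TinyUNet()(img).shape)                         # torch.Size([1, 3, 64, 64])
```

The residual fusion at the bottleneck illustrates the abstract's stated design goal: the FFT branch captures global structure cheaply, while the state-space branch models long-range dependencies with cost linear in sequence length, avoiding the quadratic attention that makes transformers slow on edge devices.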