The objective of single image dehazing is to restore hazy images and produce clear, high-quality results. Traditional convolutional models struggle to capture long-range dependencies because of their limited receptive fields. While Transformers excel at modeling such dependencies, their computational complexity grows quadratically with feature map resolution, making them less suitable for pixel-to-pixel dense prediction tasks. Moreover, the fixed kernels or tokens used in most models do not adapt well to blurs of varying size, resulting in suboptimal dehazing performance. In this study, we introduce a novel dehazing network based on Parallel Stripe Cross Attention (PCSA) with a multi-scale strategy. PCSA efficiently integrates long-range dependencies by capturing horizontal and vertical relationships simultaneously, allowing each pixel to gather contextual cues from an expanded spatial domain. To flexibly handle blurs of different sizes and shapes, we employ a channel-wise design with varying convolutional kernel sizes and stripe lengths in each PCSA module to capture contextual information at multiple scales. Additionally, we incorporate a softmax-based adaptive weighting mechanism within PCSA to prioritize and exploit the most critical features.
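The core idea of the parallel stripe design can be illustrated with a minimal NumPy sketch: self-attention is computed independently within each row (horizontal stripe) and each column (vertical stripe), and the two branches are fused by softmax-normalized weights. This is a simplified illustration under stated assumptions, not the paper's implementation; `gate_logits` is a hypothetical stand-in for the learned branch scores, and the channel-wise multi-scale kernels are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def stripe_attn(x):
    """Self-attention within each stripe.

    x: (N, L, C) — N independent stripes of length L with C channels.
    For simplicity, queries, keys, and values are all the input itself
    (the real model would apply learned projections first).
    """
    scores = np.einsum('nic,njc->nij', x, x) / np.sqrt(x.shape[-1])  # (N, L, L)
    return np.einsum('nij,njc->nic', softmax(scores, axis=-1), x)    # (N, L, C)

def parallel_stripe_cross_attention(feat, gate_logits=(0.0, 0.0)):
    """feat: (H, W, C) feature map.

    Horizontal branch: each row attends along the width.
    Vertical branch: each column attends along the height.
    gate_logits (hypothetical) are turned into adaptive fusion
    weights via softmax, echoing the paper's weighting mechanism.
    """
    horiz = stripe_attn(feat)                                        # rows as stripes
    vert = stripe_attn(feat.transpose(1, 0, 2)).transpose(1, 0, 2)   # columns as stripes
    w = softmax(np.asarray(gate_logits, dtype=float))
    return w[0] * horiz + w[1] * vert

H, W, C = 6, 8, 4
rng = np.random.default_rng(0)
out = parallel_stripe_cross_attention(rng.standard_normal((H, W, C)))
print(out.shape)  # (6, 8, 4)
```

Because each pixel attends only along its row and its column, the cost per branch is linear in the stripe length rather than quadratic in the full number of pixels, which is what makes the design tractable for dense prediction.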