In the research area of image super-resolution, Swin Transformer-based models are favored for their global spatial modeling and shifted-window attention mechanism. However, existing methods often limit self-attention to non-overlapping windows to reduce computational cost, overlooking the useful information that exists across channels. To address this issue, this paper introduces a novel model, the Hybrid Attention Aggregation Transformer (HAAT), designed to better leverage feature information. HAAT is constructed by integrating Swin-Dense-Residual-Connected Blocks (SDRCB) with Hybrid Grid Attention Blocks (HGAB). SDRCB expands the receptive field while maintaining a streamlined architecture, resulting in enhanced performance. HGAB incorporates channel attention, sparse attention, and window attention to improve non-local feature fusion and achieve more visually compelling results. Experimental evaluations demonstrate that HAAT surpasses state-of-the-art methods on benchmark datasets.

Keywords: Image super-resolution, Computer vision, Attention mechanism, Transformer
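The abstract states that HGAB fuses channel, sparse, and window attention, but gives no formulation. As a rough intuition for how windowed self-attention and a channel gate can be combined with a residual connection, the following toy NumPy sketch may help; every function name here is hypothetical, and the paper's actual blocks are learned, multi-headed, and far more elaborate.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, win=4):
    # Self-attention restricted to non-overlapping win x win windows,
    # as in Swin-style models. x has shape (H, W, C).
    H, W, C = x.shape
    out = np.zeros_like(x)
    for i in range(0, H, win):
        for j in range(0, W, win):
            tokens = x[i:i + win, j:j + win].reshape(-1, C)   # (win*win, C)
            attn = softmax(tokens @ tokens.T / np.sqrt(C))    # token-token weights
            out[i:i + win, j:j + win] = (attn @ tokens).reshape(win, win, C)
    return out

def channel_attention(x):
    # Squeeze-and-excitation-style gate: global average pooling
    # per channel, squashed through a sigmoid (no learned weights here).
    pooled = x.mean(axis=(0, 1))                  # (C,)
    gate = 1.0 / (1.0 + np.exp(-pooled))          # sigmoid
    return x * gate

def hybrid_block(x, win=4):
    # Hypothetical fusion: window-attention output reweighted by a
    # channel gate, plus a residual connection back to the input.
    return channel_attention(window_attention(x, win)) + x

x = np.random.default_rng(0).standard_normal((8, 8, 16))
y = hybrid_block(x)
print(y.shape)  # → (8, 8, 16)
```

The sketch omits sparse attention and all learned projections; its only point is that window attention mixes information spatially within each window, while the channel gate reweights features globally across the whole map, so composing the two recovers cross-channel information that window attention alone ignores.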