Deep learning techniques have achieved remarkable success in the semantic segmentation of remote sensing images and in land-use change detection. Nevertheless, their real-time deployment on edge platforms remains constrained by decoder complexity. Herein, we introduce LightFormer, a lightweight decoder for time-critical tasks that involve unstructured targets, such as disaster assessment, unmanned aerial vehicle search-and-rescue, and cultural heritage monitoring. LightFormer employs a feature-fusion and refinement module built on channel processing and a learnable gating mechanism to aggregate multi-scale, multi-range information efficiently, which drastically curtails model complexity. Furthermore, we propose a spatial information selection module (SISM) that integrates long-range attention with a detail preservation branch to capture spatial dependencies across multiple scales, thereby substantially improving the recognition of unstructured targets in complex scenes. On the ISPRS Vaihingen benchmark, LightFormer attains 99.9% of GLFFNet's mIoU (83.9% vs. 84.0%) while requiring only 14.7% of its FLOPs and 15.9% of its parameters, thus achieving an excellent accuracy-efficiency trade-off. Consistent results on LoveDA, ISPRS Potsdam, RescueNet, and FloodNet further demonstrate its robustness and superior perception of unstructured objects. These findings highlight LightFormer as a practical solution for remote sensing applications where both computational economy and high-precision segmentation are imperative.
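The abstract describes a feature-fusion module that combines inputs through a learnable gating mechanism. The exact architecture is not given here, so the following is only a minimal NumPy sketch of one common form of channel-wise gated fusion (a sigmoid gate computing a per-channel convex combination of two feature maps); the function names, shapes, and parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(low, high, w, b):
    """Hypothetical channel-wise gated fusion of two (C, H, W) feature maps.

    w, b: (C,) learnable gate parameters (assumed form, for illustration).
    Returns a per-channel convex combination g*low + (1-g)*high.
    """
    # Channel descriptor via global average pooling over both inputs
    desc = (low + high).mean(axis=(1, 2))          # (C,)
    g = sigmoid(w * desc + b)[:, None, None]       # (C, 1, 1), values in (0, 1)
    return g * low + (1.0 - g) * high

# Toy usage with random features standing in for multi-scale decoder inputs
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
low = rng.standard_normal((C, H, W))    # e.g. a high-resolution, detail-rich map
high = rng.standard_normal((C, H, W))   # e.g. an upsampled, semantics-rich map
fused = gated_fuse(low, high, w=np.ones(C), b=np.zeros(C))
```

Because the gate lies in (0, 1), each fused value stays between the two input values, which is one way such a mechanism can trade off detail preservation against semantic context at low cost.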