PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation

The growing use of Unmanned Aerial Vehicles (UAVs) across many fields calls for effective UAV image segmentation, a task made difficult by the dynamic perspectives of UAV-captured images. Traditional segmentation algorithms falter because they cannot model the complexity of UAV perspectives, and obtaining multi-perspective labeled datasets is prohibitively expensive. To address these issues, we introduce PPTFormer, a novel Pseudo Multi-Perspective Transformer network for UAV image segmentation. Our approach circumvents the need for actual multi-perspective data by creating pseudo perspectives for enhanced multi-perspective learning. PPTFormer combines Perspective Decomposition, novel Perspective Prototypes, and a specialized encoder and decoder that together achieve superior segmentation results through Pseudo Multi-Perspective Attention (PMP Attention) and fusion. Our experiments demonstrate that PPTFormer achieves state-of-the-art performance across five UAV segmentation datasets, confirming its capability to simulate UAV flight perspectives effectively and to improve segmentation precision significantly. This work advances UAV scene understanding and sets a new benchmark for future developments in semantic segmentation.
