The traditional Transformer model struggles with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), which raises efficiency and scalability concerns. To address this, we propose a pyramid-based hierarchical Transformer (PyFormer). The approach organizes the input hierarchically into segments, each representing a distinct level of abstraction, which makes long sequences more efficient to process. A dedicated Transformer module is applied at each level to capture both local and global context. Spatial and spectral information flows through the hierarchy, propagating abstractions between levels. The outputs of the different levels are then integrated to form the final representation of the input. Experimental results show that the proposed method outperforms traditional approaches. In addition, the use of disjoint samples improves robustness and reliability, highlighting the potential of the approach for advancing HSIC. The source code is available at https://github.com/mahmad00/PyFormer.
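The hierarchy described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation (see the repository for that); it only shows the general pattern of applying attention at each level, pooling to a coarser level, and integrating the per-level outputs. The function names (`self_attention`, `pool`, `pyramid_encode`), the average pooling, the pooling factor, and the concatenation of per-level means are all assumptions made for illustration, and the attention here has no learned weights.

```python
import math


def self_attention(tokens):
    """Scaled dot-product self-attention without learned projections.

    Assumption: queries, keys and values are the tokens themselves, which
    keeps the sketch short; a real Transformer block learns these projections.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v[i] for w, v in zip(weights, tokens)) for i in range(d)]
        )
    return out


def pool(tokens, factor=2):
    """Average each run of `factor` adjacent tokens to build a coarser level."""
    d = len(tokens[0])
    pooled = []
    for start in range(0, len(tokens), factor):
        group = tokens[start:start + factor]
        pooled.append([sum(t[i] for t in group) / len(group) for i in range(d)])
    return pooled


def mean_token(tokens):
    """Summarise one level as the mean of its tokens."""
    d = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(d)]


def pyramid_encode(tokens, levels=3, factor=2):
    """Apply attention at each level, pool, and concatenate level summaries."""
    summaries = []
    current = tokens
    for _ in range(levels):
        current = self_attention(current)      # context within this level
        summaries.append(mean_token(current))  # per-level output
        if len(current) > 1:
            current = pool(current, factor)    # move to a coarser level
    # Integrate the levels by concatenating their summaries.
    return [x for summary in summaries for x in summary]


# Hypothetical input: 8 tokens of dimension 4 (e.g. groups of spectral bands).
example = [[float(i + j) / 10.0 for j in range(4)] for i in range(8)]
features = pyramid_encode(example, levels=3, factor=2)
```

Because each level halves the number of tokens in this sketch, the quadratic cost of attention shrinks at every level, which is the efficiency argument a pyramid design generally relies on. The resulting `features` vector has `levels * d` entries and would feed a classifier in a full pipeline.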