Intra-class variation in open-world settings poses several challenges for classification. Fine-grained classification was introduced to address these challenges, and many approaches have been proposed. Some locate and exploit discriminative local parts within images to achieve invariance to viewpoint changes, intra-class differences, and local part deformations. Inspired by P2P-Net, our approach offers an end-to-end trainable, attention-based parts alignment module that replaces P2P-Net's graph-matching component with a self-attention mechanism. The attention module lets the parts attend to one another and learn their optimal arrangement before they contribute to the global loss.
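To make the idea of parts attending to each other concrete, the following is a minimal sketch of generic single-head scaled dot-product self-attention applied to a set of part features. It is not the module described in the abstract, whose details are not given here. The function name `self_attend_parts`, the feature sizes, and the random projection matrices are all assumptions chosen for illustration; in a trained model the projections would be learned end to end through the global loss.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attend_parts(parts, w_q, w_k, w_v):
    """Hypothetical helper: scaled dot-product self-attention over parts.

    parts: array of shape (num_parts, dim), one feature vector per local part.
    w_q, w_k, w_v: projection matrices of shape (dim, dim).
    Returns the attended part features and the attention weights.
    """
    q = parts @ w_q
    k = parts @ w_k
    v = parts @ w_v
    # Each part scores every part (including itself), scaled by sqrt(dim).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Illustrative sizes only: 4 parts with 8-dimensional features.
rng = np.random.default_rng(0)
num_parts, dim = 4, 8
parts = rng.normal(size=(num_parts, dim))
# Random stand-ins for projections that would normally be learned.
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

attended, weights = self_attend_parts(parts, w_q, w_k, w_v)
```

Each row of `weights` describes how strongly one part draws on every other part, so the attended features are soft re-mixtures of the original parts. That is one plausible way an attention layer could stand in for an explicit matching step, under the assumptions stated above.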