This paper proposes an adaptive graph-based approach for multi-label image classification. Graph-based methods have been widely used in multi-label classification because they can model label correlations, and their effectiveness has been demonstrated in both single-domain and multi-domain settings. However, the graph topology they rely on is pre-defined heuristically and is therefore suboptimal. In addition, consecutive Graph Convolutional Network (GCN) aggregations tend to destroy feature similarity. To overcome these issues, we introduce an architecture that learns the graph connectivity in an end-to-end fashion by integrating an attention-based mechanism and a similarity-preserving strategy. The proposed framework is then extended to multiple domains using an adversarial training scheme. Extensive experiments are reported on well-known single-domain and multi-domain benchmarks. The results show that our approach is competitive with the state of the art in terms of mean Average Precision (mAP) and model size. The code will be made publicly available.
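To make the idea of a learned, attention-based graph concrete, the following is a minimal sketch and not the architecture proposed in the paper. It assumes that each label is represented by an embedding vector, that edge weights are obtained from scaled dot-product attention between these embeddings, and that feature similarity is preserved by mixing each node's own features back in after aggregation. The function names, the `alpha` mixing weight, and the toy embeddings are all hypothetical; a trainable model would additionally apply learnable projections before computing the scores.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_adjacency(embeddings):
    """Build a row-normalized adjacency matrix from label embeddings.

    Assumption: edge weights are scaled dot-product attention scores,
    so the graph is derived from the data rather than fixed in advance.
    """
    scale = math.sqrt(len(embeddings[0]))
    scores = [
        [sum(a * b for a, b in zip(u, v)) / scale for v in embeddings]
        for u in embeddings
    ]
    return [softmax(row) for row in scores]


def aggregate(adjacency, embeddings, alpha=0.5):
    """One propagation step over the graph.

    Assumption: `alpha` blends each node's original features with the
    aggregated neighbor features, a simple stand-in for a
    similarity-preserving strategy.
    """
    n, dim = len(embeddings), len(embeddings[0])
    out = []
    for i in range(n):
        mixed = [
            sum(adjacency[i][j] * embeddings[j][d] for j in range(n))
            for d in range(dim)
        ]
        out.append(
            [alpha * embeddings[i][d] + (1 - alpha) * mixed[d] for d in range(dim)]
        )
    return out


# Toy label embeddings (hypothetical): labels 0 and 1 are similar, label 2 is not.
label_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adjacency = attention_adjacency(label_embeddings)
updated = aggregate(adjacency, label_embeddings)
```

In this toy setting, label 0 attends more strongly to label 1 than to label 2, and setting `alpha=1.0` returns the input features unchanged, which illustrates how the mixing weight trades off aggregation against preserving the original features.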