This paper proposes an adaptive graph-based approach for multi-label image classification. Graph-based methods have been widely used in multi-label classification because they can model label correlations, and their effectiveness has been demonstrated in both single-domain and multi-domain settings. However, the graph topology they rely on is pre-defined heuristically and is therefore not optimal. In addition, consecutive Graph Convolutional Network (GCN) aggregations tend to destroy feature similarity. To overcome these issues, we introduce an architecture that learns the graph connectivity in an end-to-end fashion by integrating an attention-based mechanism and a similarity-preserving strategy. The proposed framework is then extended to multiple domains using an adversarial training scheme. Experiments on well-known single-domain and multi-domain benchmarks show that our approach is competitive with the state of the art in terms of mean Average Precision (mAP) and model size. The code will be made publicly available.
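To make the idea of learned graph connectivity concrete, the following is a minimal NumPy sketch, not the authors' architecture. It assumes that label nodes carry embedding vectors, that the adjacency matrix is obtained from scaled dot-product attention between those embeddings, and that a single GCN-style aggregation follows. The function names (`attention_adjacency`, `gcn_layer`, `cosine_similarity_matrix`), the projection matrices `Wq`, `Wk`, `W`, and the similarity-gap measure are all hypothetical choices made for illustration. The weights here are random, whereas in an end-to-end setting they would be trained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(H, Wq, Wk):
    # Assumed form: scaled dot-product attention between label embeddings.
    # Each row of the result is a distribution over neighbouring labels,
    # so the graph is a function of the embeddings rather than a fixed heuristic.
    Q = H @ Wq
    K = H @ Wk
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    return softmax(scores, axis=1)

def gcn_layer(A, H, W):
    # One GCN-style aggregation: mix neighbour features, project, apply ReLU.
    return np.maximum(A @ H @ W, 0.0)

def cosine_similarity_matrix(X, eps=1e-12):
    # Pairwise cosine similarity between rows of X.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, eps)
    return Xn @ Xn.T

rng = np.random.default_rng(0)
num_labels, dim = 5, 8                      # hypothetical sizes
H = rng.normal(size=(num_labels, dim))      # label node embeddings
Wq = rng.normal(size=(dim, dim)) * 0.1      # random stand-ins for trained weights
Wk = rng.normal(size=(dim, dim)) * 0.1
W = rng.normal(size=(dim, dim)) * 0.1

A = attention_adjacency(H, Wq, Wk)          # learned (here: random) connectivity
H1 = gcn_layer(A, H, W)                     # aggregated label features

# Hypothetical diagnostic: how far pairwise feature similarity drifts after
# aggregation. A similarity-preserving term could penalise a quantity like this.
sim_gap = np.abs(cosine_similarity_matrix(H) - cosine_similarity_matrix(H1)).mean()
print(A.shape, H1.shape, float(sim_gap))
```

In this sketch the row-wise softmax makes every row of `A` sum to one, so each label aggregates a convex combination of the other labels' features. The `sim_gap` value is only one possible way to quantify the loss of feature similarity that the abstract attributes to repeated GCN aggregations.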