Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification

Interpretable-by-design models are gaining traction in computer vision because they provide faithful explanations for their predictions. In image classification, these models typically recover human-interpretable concepts from an image and use them for classification. Sparse concept recovery methods leverage the latent space of vision-language models to represent image embeddings as sparse combinations of concept embeddings. However, by ignoring the hierarchical structure of semantic concepts, these methods may produce correct predictions with explanations that are inconsistent with the hierarchy. In this work, we propose Hierarchical Concept Embedding & Pursuit (HCEP), a framework that induces a hierarchy of concept embeddings in the latent space and performs hierarchical sparse coding to recover the concepts present in an image. Given a hierarchy of semantic concepts, we introduce a geometric construction for the corresponding hierarchy of embeddings. Under the assumption that the true concepts form a rooted path in the hierarchy, we derive sufficient conditions for their recovery in the embedding space. We further show that hierarchical sparse coding reliably recovers hierarchical concept embeddings, whereas standard sparse coding fails. Experiments on real-world datasets show that HCEP improves concept precision and recall compared to existing methods while maintaining competitive classification accuracy. Moreover, when the number of samples available for concept estimation and classifier training is limited, HCEP achieves superior classification accuracy and concept recovery. Our results demonstrate that incorporating hierarchical structure into sparse concept recovery leads to more faithful and interpretable image classification models.
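
To make the idea of recovering a rooted path concrete, the following is a minimal sketch of a greedy, matching-pursuit-style descent through a concept hierarchy. It is an illustration only, not the HCEP construction or algorithm from this work. The toy hierarchy, the function names `make_dictionary` and `hierarchical_pursuit`, the path weights, and the orthonormal concept embeddings are all assumptions chosen so that the example is easy to follow.

```python
import numpy as np

# Hypothetical toy hierarchy: each key maps a concept to its children.
# "root" is a virtual node with no embedding of its own.
CHILDREN = {
    "root": ["animal", "vehicle"],
    "animal": ["dog", "bird"],
    "vehicle": ["car", "boat"],
    "dog": ["beagle", "husky"],
    "bird": ["sparrow", "eagle"],
    "car": ["sedan", "truck"],
    "boat": ["canoe", "ferry"],
}
CONCEPTS = [c for kids in CHILDREN.values() for c in kids]


def make_dictionary(dim=32, seed=0):
    """Return one unit-norm embedding per concept.

    Assumption: embeddings are exactly orthonormal (via QR), an idealization
    that real vision-language embeddings do not satisfy.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, len(CONCEPTS))))
    return {name: q[:, i] for i, name in enumerate(CONCEPTS)}


def hierarchical_pursuit(x, atoms):
    """Greedily select one concept per level, walking from the root to a leaf."""
    residual = x.copy()
    node, path = "root", []
    while node in CHILDREN:
        # Only the children of the current node compete at this level.
        scores = {c: abs(residual @ atoms[c]) for c in CHILDREN[node]}
        node = max(scores, key=scores.get)
        # Subtract the selected atom's contribution (atoms are unit norm).
        residual = residual - (residual @ atoms[node]) * atoms[node]
        path.append(node)
    return path, residual


atoms = make_dictionary()
true_path = ["animal", "bird", "eagle"]
# Synthetic "image embedding": a weighted sum of the atoms on the true path.
x = sum(w * atoms[c] for w, c in zip([1.0, 0.7, 0.4], true_path))
path, residual = hierarchical_pursuit(x, atoms)
print(path, np.linalg.norm(residual))
```

Restricting each step to the children of the previously selected concept means the recovered support is a rooted path by construction, which is the kind of hierarchy-consistent explanation the abstract argues for. An unconstrained sparse coder that searches over all concepts at once offers no such guarantee.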
