Concept Bottleneck Models (CBMs) enhance interpretability by predicting human-understandable concepts as intermediate representations. However, existing CBMs often suffer from input-to-concept mapping bias and limited controllability, which restrict their practical value and directly undermine the accountability of concept-based decisions. We propose a lightweight Disentangled Concept Bottleneck Model (LDCBM) that automatically groups visual features into semantically meaningful components without region annotation. By introducing a filter grouping loss and joint concept supervision, our method improves the alignment between visual patterns and concepts, enabling more transparent and robust decision-making. Notably, experiments on three diverse datasets demonstrate that LDCBM achieves higher concept and class accuracy, outperforming previous CBMs in both interpretability and classification performance. By grounding concepts in visual evidence, our method overcomes a fundamental limitation of prior models and enhances the reliability of interpretable AI.
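The abstract does not specify the form of the filter grouping loss, but a common way to realize such an objective is to make filters assigned to the same concept group similar and filters in different groups dissimilar. The sketch below is a hypothetical illustration of this idea under that assumption (the function name, the cosine-similarity choice, and the intra/inter contrast are illustrative, not taken from the paper):

```python
import numpy as np

def filter_grouping_loss(filters, group_ids):
    """Hypothetical filter grouping loss (illustrative, not the paper's exact
    formulation): encourage filters in the same concept group to point in
    similar directions (high cosine similarity) and filters in different
    groups to be dissimilar.

    filters:   (n_filters, dim) array of flattened filter weights
    group_ids: (n_filters,) integer group assignment per filter
    """
    # L2-normalize each filter so dot products become cosine similarities
    normed = filters / np.linalg.norm(filters, axis=1, keepdims=True)
    sim = normed @ normed.T                        # (n, n) cosine similarities
    same = group_ids[:, None] == group_ids[None, :]
    off_diag = ~np.eye(len(filters), dtype=bool)
    intra = sim[same & off_diag].mean()            # similarity within groups
    inter = sim[~same].mean()                      # similarity across groups
    # Lower is better: tight groups (intra -> 1) that are well separated
    # (inter -> 0) drive the loss toward -1.
    return inter - intra
```

In a full model this term would be added to the standard concept and class losses, so that gradient descent jointly shapes the backbone's filters into disentangled, concept-aligned groups.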