Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies have focused on settings where the data, including the concepts, are clean. In practice, we often need to remove or insert training data or concepts in a trained CBM for reasons such as privacy concerns, data mislabeling, spurious concepts, and concept annotation errors. Deriving efficiently editable CBMs without retraining from scratch therefore remains a challenge, particularly in large-scale applications. To address this challenge, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions, which obviate the need for retraining. Experimental results demonstrate the efficiency and effectiveness of our ECBMs, affirming their adaptability within the realm of CBMs.
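The abstract's key mechanism is a closed-form, influence-function-based parameter update that approximates retraining after data removal. As a generic illustration of that idea (not the ECBM derivation itself, and with all names and the logistic-regression setup chosen purely for the sketch), the following shows the classic one-step edit for deleting a single training point from a ridge-regularized model: the leave-one-out parameters are approximated by theta + (1/n) H^{-1} grad_i(theta), where H is the Hessian of the full regularized empirical risk.

```python
import numpy as np

# Hedged sketch: influence-function removal of one training point from a
# ridge-regularized logistic model. This illustrates the general technique
# the abstract invokes, not the paper's specific ECBM equations.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, lam, iters=30):
    """Fit L2-regularized logistic regression with Newton's method."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / n + lam * theta
        H = (X * (p * (1 - p))[:, None]).T @ X / n + lam * np.eye(d)
        theta = theta - np.linalg.solve(H, grad)
    return theta

def remove_one(X, y, theta, lam, i):
    """Closed-form influence update approximating deletion of point i."""
    n, d = X.shape
    p = sigmoid(X @ theta)
    H = (X * (p * (1 - p))[:, None]).T @ X / n + lam * np.eye(d)
    g_i = (sigmoid(X[i] @ theta) - y[i]) * X[i]  # gradient of point i's loss
    return theta + np.linalg.solve(H, g_i) / n

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.normal(size=(n, d))
y = (sigmoid(X @ rng.normal(size=d)) > 0.5).astype(float)

theta_full = fit(X, y, lam)                          # trained on all data
theta_edit = remove_one(X, y, theta_full, lam, i=0)  # cheap closed-form edit
theta_retrain = fit(np.delete(X, 0, axis=0), np.delete(y, 0), lam)

# The edited parameters should land closer to the retrained ones than the
# unedited parameters do.
err_edit = np.linalg.norm(theta_edit - theta_retrain)
err_orig = np.linalg.norm(theta_full - theta_retrain)
```

The appeal, mirrored at the concept-label, concept, and data levels in ECBMs, is that the update costs one Hessian solve instead of a full retraining run.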