Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we often need to remove some training data or concepts from, or insert new ones into, trained CBMs for reasons such as privacy concerns, data mislabeling, spurious concepts, and concept annotation errors. Thus, deriving efficient editable CBMs that avoid retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value in CBMs.
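To make the influence-function idea concrete, the sketch below shows the generic mechanism on a plain L2-regularized logistic regression (not the paper's CBM architecture or its specific closed-form updates, which this abstract does not detail): after removing a set of training points, the new parameters are approximated in closed form by a single Newton-style correction, `w_new ≈ w + H^{-1} Σ_{i∈S} ∇ℓ_i(w)`, instead of retraining. All function names and constants here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lam=1e-2, iters=200):
    """Train L2-regularized logistic regression with Newton's method."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) + lam * w
        H = X.T @ (X * (p * (1 - p))[:, None]) + lam * np.eye(d)
        step = np.linalg.solve(H, grad)
        w -= step
        if np.linalg.norm(step) < 1e-10:
            break
    return w

def influence_removal(w, X, y, remove_idx, lam=1e-2):
    """Closed-form (one-step) approximation of retraining without the
    points in `remove_idx`, starting from the full-data optimum `w`."""
    keep = np.setdiff1d(np.arange(len(y)), remove_idx)
    p = sigmoid(X @ w)
    # Hessian of the remaining-data objective, evaluated at w
    Hk = (X[keep].T @ (X[keep] * (p[keep] * (1 - p[keep]))[:, None])
          + lam * np.eye(X.shape[1]))
    # At the full-data optimum, the remaining-data gradient equals minus
    # the summed gradient of the removed points
    g_removed = X[remove_idx].T @ (p[remove_idx] - y[remove_idx])
    return w + np.linalg.solve(Hk, g_removed)

# Demo on synthetic data: compare the closed-form edit to full retraining
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (sigmoid(X @ np.array([1.0, -2.0, 0.5])) > rng.uniform(size=200)).astype(float)

w_full = fit_logreg(X, y)
remove_idx = np.arange(5)                      # pretend these must be unlearned
w_retrain = fit_logreg(X[5:], y[5:])           # expensive: retrain from scratch
w_approx = influence_removal(w_full, X, y, remove_idx)  # cheap: one linear solve
```

The same pattern underlies data-level editing generally: the cost of the update is one Hessian solve rather than a full optimization, and the approximation error is second order in the size of the removed set.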