Concept bottleneck models (CBMs) are a popular way of creating more interpretable neural networks by having hidden-layer neurons correspond to human-understandable concepts. However, existing CBMs and their variants have two crucial limitations: first, they require labeled data for each of the predefined concepts, which is time-consuming and labor-intensive to collect; second, the accuracy of a CBM is often significantly lower than that of a standard neural network, especially on more complex datasets. This poor performance creates a barrier to adopting CBMs in practical real-world applications. Motivated by these challenges, we propose Label-free CBM, a novel framework for transforming any neural network into an interpretable CBM without labeled concept data while retaining high accuracy. Label-free CBM has several advantages. It is scalable: we present the first CBM scaled to ImageNet. It is efficient: creating a CBM takes only a few hours even for very large datasets. It is automated: training it for a new dataset requires minimal human effort. Our code is available at https://github.com/Trustworthy-ML-Lab/Label-free-CBM. Finally, in Appendix B we conduct a large-scale user evaluation of the interpretability of our method.