Concept Bottleneck Models (CBMs) are inherently interpretable models that factor model decisions into human-readable concepts. They allow people to easily understand why a model is failing, a critical feature for high-stakes applications. However, CBMs require manually specified concepts and often under-perform their black box counterparts, which has prevented their broad adoption. We address these shortcomings and are the first to show how to construct high-performance CBMs, with accuracy similar to black box models, without manual concept specification. Our approach, Language Guided Bottlenecks (LaBo), leverages a language model, GPT-3, to define a large space of possible bottlenecks. Given a problem domain, LaBo uses GPT-3 to produce factual sentences about categories, which form the candidate concepts. LaBo efficiently searches the possible bottlenecks using a novel submodular utility that promotes the selection of discriminative and diverse information. Finally, GPT-3's sentential concepts are aligned to images using CLIP to form a bottleneck layer. Experiments demonstrate that LaBo is a highly effective prior for concepts important to visual recognition. In an evaluation on 11 diverse datasets, LaBo bottlenecks excel at few-shot classification: they are 11.7% more accurate than black box linear probes at 1 shot and comparable with more data. Overall, LaBo demonstrates that inherently interpretable models can be widely applied at similar, or better, performance than black box approaches.
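The selection and bottleneck steps described above can be illustrated with a small sketch. This is not LaBo's actual utility, which the abstract does not specify: as a stand-in it assumes a simple monotone submodular function (a per-concept discriminability score plus a facility-location coverage term) maximized greedily. The names `greedy_select`, `concept_scores`, `predict`, and the weight `alpha` are hypothetical, and the toy arrays stand in for embeddings that would in practice come from CLIP.

```python
import numpy as np

def greedy_select(scores, sim, k, alpha=1.0):
    """Greedily pick k concepts under an ASSUMED stand-in utility:
        F(S) = sum_{c in S} scores[c] + alpha * sum_j max_{c in S} sim[j, c]
    scores: (n,) discriminability of each candidate concept.
    sim:    (n, n) non-negative similarity between candidates.
    The coverage term rewards diversity: a near-duplicate of an already
    selected concept adds almost no coverage."""
    n = len(scores)
    selected = []
    taken = np.zeros(n, dtype=bool)
    best_cover = np.zeros(n)  # max similarity of each candidate to the selected set
    for _ in range(min(k, n)):
        # Marginal coverage gain of adding each candidate c.
        cover_gain = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gain = scores + alpha * cover_gain
        gain[taken] = -np.inf  # never pick the same concept twice
        c = int(np.argmax(gain))
        selected.append(c)
        taken[c] = True
        best_cover = np.maximum(best_cover, sim[:, c])
    return selected

def concept_scores(image_emb, concept_emb):
    """Bottleneck layer: one score per (image, concept) pair via dot product."""
    return image_emb @ concept_emb.T

def predict(image_emb, concept_emb, W):
    """Linear classifier on top of the concept scores; W has shape (k, num_classes)."""
    return (concept_scores(image_emb, concept_emb) @ W).argmax(axis=1)

# Toy data: concepts 0 and 1 are near-duplicates, concept 2 is distinct but weaker.
toy_scores = np.array([0.9, 0.85, 0.1])
toy_sim = np.array([[1.0, 0.95, 0.0],
                    [0.95, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
print(greedy_select(toy_scores, toy_sim, k=2))  # → [0, 2]
```

In the toy run, the second pick is concept 2 rather than the higher-scoring concept 1, because concept 1 is almost fully covered by concept 0. Setting `alpha=0.0` removes the diversity term and the selection falls back to the top-scoring concepts.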