Neuro-Symbolic Concept-based Models (NeSy-CBMs) are a family of architectures that integrate neural networks with symbolic reasoning to improve reliability in high-stakes applications. They first extract high-level concepts from the input and then infer a task label from those concepts in a way that is consistent with given logical constraints. However, their label and concept predictions can be overconfident, making it difficult for stakeholders to gauge when the model's decisions can be trusted. We address this issue by integrating ideas from Conformal Prediction (CP), a framework that provides rigorous, distribution-free coverage guarantees. We formalize three desiderata (consistency, coverage, and conciseness) that any conformal method for NeSy-CBMs should satisfy, and show that existing approaches fall short of at least one of them. We then introduce COCOCO, a post-hoc framework that conformalizes concepts and labels jointly and reconciles them via a single deduction-abduction revision step. COCOCO satisfies all three desiderata, retains distribution-free coverage, is robust to imperfect knowledge, and supports user-specified size budgets. Experiments on eight data sets show that COCOCO compares favorably with competitors and natural baselines in terms of performance and set size.
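For readers unfamiliar with the coverage guarantee mentioned above, the following is a minimal sketch of standard split conformal prediction applied to labels only. It is not COCOCO: it does not conformalize concepts, enforce logical constraints, or perform the deduction-abduction revision step. The function names `conformal_quantile` and `prediction_sets`, the nonconformity score (one minus the predicted probability), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    # Use the score of rank ceil((n + 1) * (1 - alpha)) among the sorted scores.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: only the full set is valid.
        return np.inf
    return np.sort(cal_scores)[k - 1]


def prediction_sets(probs, qhat):
    """Keep every label whose score 1 - p does not exceed the threshold qhat."""
    return [np.flatnonzero(1.0 - p <= qhat) for p in probs]


# Synthetic classifier outputs (illustrative only, not data from the paper).
rng = np.random.default_rng(0)
n, n_classes = 1000, 4
labels = rng.integers(0, n_classes, size=n)
logits = rng.normal(size=(n, n_classes))
logits[np.arange(n), labels] += 2.0  # make the true label more likely
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Calibrate on the first half, build prediction sets on the second half.
cal, test = slice(0, 500), slice(500, None)
cal_scores = 1.0 - probs[cal][np.arange(500), labels[cal]]
qhat = conformal_quantile(cal_scores, alpha=0.1)
sets = prediction_sets(probs[test], qhat)

coverage = np.mean([y in s for y, s in zip(labels[test], sets)])
avg_size = np.mean([len(s) for s in sets])
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least 1 - alpha on average over draws of the data. The guarantee is marginal, so it does not hold for each input individually, and it says nothing about whether the label set is consistent with a concept set, which is the gap the desiderata above are meant to address.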