To Neuro-Symbolic Classification and Beyond by Compiling Description Logic Ontologies to Probabilistic Circuits

Background: Neuro-symbolic methods enhance the reliability of neural network classifiers through logical constraints, but they lack native support for ontologies. Objectives: We aim to develop a neuro-symbolic method that reliably outputs predictions consistent with a Description Logic ontology that formalizes domain-specific knowledge. Methods: We encode a Description Logic ontology as a circuit, a feed-forward differentiable computational graph that supports tractable execution of queries and transformations. We show that the circuit can be used to (i) generate synthetic datasets that capture the semantics of the ontology; (ii) efficiently perform deductive reasoning on a GPU; and (iii) implement neuro-symbolic models whose predictions are approximately or provably consistent with the knowledge defined in the ontology. Results: We show that the synthetic dataset generated using the circuit qualitatively captures the semantics of the ontology while being challenging for Machine Learning classifiers, including neural networks. Moreover, we show that compiling the ontology into a circuit is a promising approach for scalable deductive reasoning, with runtimes up to three orders of magnitude faster than available reasoners. Finally, we show that our neuro-symbolic classifiers produce consistent predictions more reliably than neural network baselines, while matching or even exceeding their performance. Conclusions: By compiling Description Logic ontologies into circuits, we obtain a tighter integration between the Deep Learning and Knowledge Representation fields. We show that a single circuit representation can be used to tackle different challenging tasks closely related to real-world applications.
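To make the idea of a circuit that scores consistency with an ontology more concrete, here is a minimal sketch for one toy axiom. It is an illustration only and not the compilation procedure of this work: the axiom Dog ⊑ Animal, the function names `axiom_circuit` and `brute_force`, and the assumption that a classifier emits independent per-class probabilities are all hypothetical choices made for the example.

```python
from itertools import product


def axiom_circuit(p_dog: float, p_animal: float) -> float:
    """Evaluate a hand-built circuit for the toy axiom Dog ⊑ Animal.

    The circuit is the sum node (not Dog) + (Dog * Animal). Its two branches
    are mutually exclusive, so the sum is the probability that an assignment
    drawn from the independent marginals satisfies the axiom.
    """
    return (1.0 - p_dog) + p_dog * p_animal


def brute_force(p_dog: float, p_animal: float) -> float:
    """Reference value: enumerate all assignments and keep the consistent ones."""
    total = 0.0
    for dog, animal in product([False, True], repeat=2):
        if dog and not animal:
            continue  # violates Dog ⊑ Animal
        weight_dog = p_dog if dog else 1.0 - p_dog
        weight_animal = p_animal if animal else 1.0 - p_animal
        total += weight_dog * weight_animal
    return total


if __name__ == "__main__":
    # Hypothetical classifier outputs: confident "Dog", unsure about "Animal".
    print(axiom_circuit(0.9, 0.2))
    print(brute_force(0.9, 0.2))
```

The circuit needs a single pass over two small nodes, whereas the brute-force check enumerates every assignment. Because the circuit output is a differentiable function of the class probabilities, a quantity of this kind could in principle serve as a training signal or as a filter on predictions, under the assumptions stated above.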
