Neurosymbolic AI aims to integrate deep learning with symbolic AI. This integration holds many promises, such as decreasing the amount of data required to train a neural network, improving the explainability and interpretability of the answers that models give, and verifying the correctness of trained systems. We study neurosymbolic learning, where we have both data and background knowledge expressed in a symbolic language. How do we connect the symbolic and neural components so that they can communicate this knowledge? One option is fuzzy reasoning, which studies degrees of truth: being tall, for example, is not a binary concept. Another option is probabilistic reasoning, which studies the probability that something is true or will happen. Our first research question studies how different forms of fuzzy reasoning combine with learning. We find surprising results, such as a connection to the Raven paradox, which states that observing a green apple confirms "ravens are black". In this study, we do not use the background knowledge when we deploy our models after training. In our second research question, we study how to use background knowledge in deployed models, and we develop a new neural network layer based on fuzzy reasoning. Probabilistic reasoning is a natural fit for neural networks, which we usually train to be probabilistic. However, probabilistic reasoning is expensive to compute and does not scale well to large tasks. In our third research question, we study how to connect probabilistic reasoning with neural networks by sampling to estimate averages. In our final research question, we study how to scale probabilistic neurosymbolic learning to much larger problems than before. Our insight is to train a neural network on synthetic data to predict the result of probabilistic reasoning.
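As a minimal sketch of the two kinds of reasoning mentioned in the abstract, the Python snippet below evaluates the rule "raven implies black" in two ways: as a fuzzy degree of truth, and as a probability estimated by sampling. The function names, the choice of the Reichenbach implication, and the assumption that the two predicates are independent are all illustrative assumptions, not details taken from the abstract.

```python
import random


def fuzzy_implies(a: float, b: float) -> float:
    """Reichenbach fuzzy implication: degree of truth of 'a implies b'.

    a and b are degrees of truth in [0, 1], for example the outputs of
    two (hypothetical) neural classifiers for 'is a raven' and 'is black'.
    """
    return 1.0 - a + a * b


def sampled_implication(p_a: float, p_b: float,
                        n_samples: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(a implies b).

    Assumes, for illustration only, that a and b are independent
    Bernoulli variables with success probabilities p_a and p_b.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        a = rng.random() < p_a  # sample 'is a raven'
        b = rng.random() < p_b  # sample 'is black'
        if (not a) or b:        # classical truth of 'a implies b'
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    # A green apple: almost certainly not a raven, almost certainly not black.
    print(fuzzy_implies(0.01, 0.05))
    print(sampled_implication(0.01, 0.05))
```

Under these assumptions, the green apple gives the rule a degree of truth of about 0.99, even though the object is neither a raven nor black. This is one way to read the connection to the Raven paradox. The sampled estimate approaches the same value as the number of samples grows, because under independence the exact probability is also 1 - p_a + p_a * p_b. Sampling trades exactness for cost, which is the kind of trade-off that estimating averages by sampling relies on.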