Neural-symbolic approaches have recently gained popularity as a way to inject prior knowledge into a learner without requiring the learner to induce this knowledge from data. These approaches can potentially learn competitive solutions while significantly reducing the amount of supervised data required. A large class of neural-symbolic approaches represents prior knowledge in First-Order Logic, relaxed to a differentiable form using fuzzy logic. This paper shows that the loss function expressing these neural-symbolic learning tasks can be unambiguously determined once a t-norm generator is selected. When restricted to supervised learning, the presented theoretical apparatus provides a clean justification for the popular cross-entropy loss, which has been shown to provide faster convergence and to reduce the vanishing gradient problem in very deep structures. Moreover, the proposed learning formulation extends the advantages of the cross-entropy loss to the general knowledge that can be represented by a neural-symbolic method. The methodology therefore allows the development of a novel class of loss functions, which the experimental results show to converge faster than the approaches previously proposed in the literature.
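The link between a t-norm generator and the resulting loss can be illustrated with a minimal sketch. It assumes the standard additive generators of two Archimedean t-norms, g(x) = -log(x) for the product t-norm and g(x) = 1 - x for the Łukasiewicz t-norm, and it treats each supervised example as a constraint whose truth degree is the probability assigned to the correct label. All function names are hypothetical and chosen for illustration only; this is not the paper's implementation, and the clipping at g(0) that appears in the general definition of a generated t-norm is omitted.

```python
import math


def product_generator(x):
    """Additive generator of the product t-norm: g(x) = -log(x)."""
    return -math.log(x)


def lukasiewicz_generator(x):
    """Additive generator of the Lukasiewicz t-norm: g(x) = 1 - x."""
    return 1.0 - x


def generated_loss(truth_degrees, generator):
    """Sum the generator over the truth degrees of all constraints.

    An Archimedean t-norm can be written as
    T(x, y) = g^{-1}(min(g(0), g(x) + g(y))), so applying g to a
    conjunction of constraints gives a sum of g over the conjuncts.
    Assumption: the clipping at g(0) is ignored in this sketch.
    """
    return sum(generator(t) for t in truth_degrees)


def supervised_truth_degrees(probs, labels):
    """Truth degree of 'the prediction matches the label' per example."""
    return [p if y == 1 else 1.0 - p for p, y in zip(probs, labels)]


def binary_cross_entropy(probs, labels):
    """Reference binary cross-entropy, summed over examples."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for p, y in zip(probs, labels)
    )


# Toy predictions and binary labels (made up for illustration).
probs = [0.9, 0.2, 0.7]
labels = [1, 0, 1]

truth = supervised_truth_degrees(probs, labels)
loss_product = generated_loss(truth, product_generator)
loss_lukasiewicz = generated_loss(truth, lukasiewicz_generator)
reference_bce = binary_cross_entropy(probs, labels)
```

Under these assumptions, `loss_product` coincides with `reference_bce`, which is the supervised special case the abstract refers to, while swapping in `lukasiewicz_generator` yields a different loss (here a sum of absolute errors) from the same logical constraints.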