Neurosymbolic programs combine deep learning with symbolic reasoning to achieve better data efficiency, interpretability, and generalizability than standalone deep learning approaches. However, existing neurosymbolic learning frameworks implement an uneasy marriage between a highly scalable, GPU-accelerated neural component and a slower symbolic component that runs on CPUs. We propose Lobster, a unified framework for harnessing GPUs in an end-to-end manner for neurosymbolic learning. Lobster maps a general neurosymbolic language based on Datalog to the GPU programming paradigm. This mapping is implemented via compilation to a new intermediate language called APM. The extra abstraction provided by APM allows Lobster to be both flexible, supporting discrete, probabilistic, and differentiable modes of reasoning on GPU hardware through a library of provenance semirings, and performant, through new optimization passes. We demonstrate that Lobster programs can solve interesting problems spanning the domains of natural language processing, image processing, program reasoning, bioinformatics, and planning. On a suite of 8 applications, Lobster achieves an average speedup of 5.3x over Scallop, a state-of-the-art neurosymbolic framework, and enables scaling of neurosymbolic solutions to previously infeasible tasks.
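To illustrate how a single Datalog program can support several modes of reasoning by swapping the provenance semiring, the following is a minimal CPU-side sketch in plain Python. It is not Lobster's API, its APM intermediate language, or its GPU implementation; the names `Semiring`, `BOOLEAN`, `MAX_MIN`, `MAX_TIMES`, and `transitive_closure` are hypothetical and introduced only for this example. The sketch evaluates the classic two-rule transitive closure program to a fixpoint, tagging each derived fact with a semiring value. The differentiable mode is not shown.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Fact = Tuple[str, str]


@dataclass(frozen=True)
class Semiring:
    """A provenance semiring: 'add' merges alternative derivations,
    'mul' combines the tags of facts joined in one rule body."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Discrete reasoning: a fact is either derivable or not.
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
# Two simple probability-like taggings (assumed here for illustration).
MAX_MIN = Semiring(0.0, 1.0, max, min)
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)


def transitive_closure(edges: Dict[Fact, Any], sr: Semiring) -> Dict[Fact, Any]:
    """Naive fixpoint evaluation of:
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Every fact carries a tag drawn from the semiring `sr`."""
    path = dict(edges)  # first rule: each edge is a path
    changed = True
    while changed:
        changed = False
        updated = dict(path)
        for (x, y), path_tag in path.items():
            for (y2, z), edge_tag in edges.items():
                if y != y2:
                    continue  # join condition of the second rule
                derived = sr.mul(path_tag, edge_tag)
                old = updated.get((x, z), sr.zero)
                merged = sr.add(old, derived)
                if merged != old:
                    updated[(x, z)] = merged
                    changed = True
        path = updated
    return path


if __name__ == "__main__":
    edges = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.5}
    print(transitive_closure(edges, MAX_MIN))
    print(transitive_closure(edges, MAX_TIMES))
    print(transitive_closure({k: True for k in edges}, BOOLEAN))
```

The rule-evaluation loop is identical across all three runs; only the tag arithmetic changes. A GPU implementation would replace the nested Python loops with data-parallel joins, which this sketch does not attempt.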