Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, learning neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that relies only on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.
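To make the setup concrete, the following minimal sketch shows the shape of a neural program: a DNN outputs distributions over symbols, a black-box Python function consumes the symbols, and a loss is estimated by sampling symbols and checking the program's output against the end-to-end label. This is an illustrative toy, not the ISED algorithm itself; the function names (`black_box_add`, `sample_loss`) are hypothetical.

```python
import numpy as np

def black_box_add(a: int, b: int) -> int:
    # Opaque component: could equally be an arbitrary program or an LLM API call.
    return a + b

def sample_loss(probs_a, probs_b, label, n_samples=500, rng=None):
    """Estimate a loss for the composite without differentiating through
    black_box_add: draw symbols from the DNN's predicted distributions,
    run the black-box program on each sample, and score agreement with
    the end-to-end label (a simple sampling-based surrogate, assumed
    here for illustration)."""
    rng = rng or np.random.default_rng(0)
    k = len(probs_a)
    hits = 0
    for _ in range(n_samples):
        a = rng.choice(k, p=probs_a)  # sample a symbol for the first input
        b = rng.choice(k, p=probs_b)  # sample a symbol for the second input
        if black_box_add(a, b) == label:
            hits += 1
    # Fraction of samples whose program output misses the label.
    return 1.0 - hits / n_samples
```

With distributions concentrated on digits 3 and 4 and the label 7, the estimated loss is 0; a feedback signal of this kind is what lets the DNN's parameters be trained from composite-level labels alone.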