The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiveness, whereas neural networks excel in performance but are known for being black boxes. In this paper, we show that a combination of Large Language Models (LLMs) and symbolic programs can bridge this gap. In the proposed LLM-based Symbolic Programs (LSPs), a pretrained LLM prompted with natural language provides a massive set of interpretable modules that transform raw input into natural language concepts. Symbolic programs then integrate these modules into an interpretable decision rule. To train LSPs, we develop a divide-and-conquer approach that incrementally builds the program from scratch, where the learning at each step is guided by LLMs. To evaluate the effectiveness of LSPs in extracting interpretable and accurate knowledge from data, we introduce IL-Bench, a collection of diverse tasks spanning both synthetic and real-world scenarios across different modalities. Empirical results demonstrate LSPs' superior performance compared to traditional neurosymbolic programs and vanilla automatic prompt tuning methods. Moreover, because the knowledge learned by an LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs, and it generalizes well to out-of-distribution samples.