Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning. This often requires maximizing the likelihood of a symbolic constraint with respect to the neural network's output distribution. Such output distributions are typically assumed to be fully factorized, an assumption that limits the applicability of neuro-symbolic learning to more expressive autoregressive distributions, such as those defined by transformers. Under such distributions, computing the likelihood of even simple constraints is #P-hard. Instead of attempting to enforce the constraint on the entire output distribution, we propose to do so on a random, local approximation thereof. More precisely, we optimize the likelihood of the constraint under a pseudolikelihood-based approximation centered around a model sample. Our approximation is factorized, allowing the reuse of solutions to sub-problems, a main tenet for efficiently computing neuro-symbolic losses. Moreover, it is a local, high-fidelity approximation of the likelihood, exhibiting low entropy and low KL-divergence around the model sample. We evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation, and observe that it greatly improves upon the base model's ability to predict logically consistent outputs. We also evaluate on the task of detoxifying large language models. Using a simple constraint that disallows a list of toxic words, we are able to steer the model's outputs away from toxic generations, achieving state-of-the-art detoxification compared to previous approaches.
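To make the idea of a pseudolikelihood-based approximation centered around a model sample concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy three-token vocabulary, a hand-written autoregressive model (`cond_probs`), and a single "banned token" constraint; all function names and the toy model are hypothetical. For a sample y, each factor is q_i(v) = p(y_i = v | y_{-i}), obtained by renormalizing the joint probability over single-position changes to y. Because the resulting distribution is a product of independent factors, the probability that no position emits the banned token is a simple product, whereas the exact probability under the autoregressive model is computed here only by brute-force enumeration.

```python
import itertools
import math
import random

VOCAB = [0, 1, 2]  # toy vocabulary (assumption); token 2 plays the "banned" token


def cond_probs(prefix):
    """Toy autoregressive conditional p(y_i | y_<i); a stand-in for a neural model."""
    if not prefix:
        weights = [1.0, 2.0, 1.0]
    else:
        last = prefix[-1]
        # Favor repeating the previous token, so positions are not independent.
        weights = [3.0 if v == last else 1.0 for v in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]


def joint_prob(seq):
    """Chain-rule probability of a full sequence under the toy model."""
    p = 1.0
    for i, tok in enumerate(seq):
        p *= cond_probs(seq[:i])[tok]
    return p


def draw_sample(length, rng):
    """Ancestral sampling from the toy autoregressive model."""
    seq = ()
    for _ in range(length):
        seq += (rng.choices(VOCAB, weights=cond_probs(seq))[0],)
    return seq


def pseudolikelihood_factors(sample):
    """Return q_i(v) = p(y_i = v | y_{-i}) for every position i of the sample."""
    factors = []
    for i in range(len(sample)):
        # Score every single-position change to the sample, then renormalize.
        scores = [joint_prob(sample[:i] + (v,) + sample[i + 1:]) for v in VOCAB]
        z = sum(scores)
        factors.append([s / z for s in scores])
    return factors


def constraint_prob_factorized(factors, banned):
    """P(no position emits `banned`) under the product distribution prod_i q_i."""
    p = 1.0
    for q in factors:
        p *= 1.0 - q[banned]
    return p


def constraint_prob_exact(length, banned):
    """Brute-force constraint probability; exponential in length, for comparison only."""
    return sum(
        joint_prob(seq)
        for seq in itertools.product(VOCAB, repeat=length)
        if banned not in seq
    )


rng = random.Random(0)
sample = draw_sample(4, rng)
factors = pseudolikelihood_factors(sample)
approx = constraint_prob_factorized(factors, banned=2)
exact = constraint_prob_exact(len(sample), banned=2)
loss = -math.log(approx)  # negative log-likelihood of the constraint

print("sample:", sample)
print("approximate constraint probability:", approx)
print("exact constraint probability:", exact)
print("loss:", loss)
```

The approximate value depends on the sample drawn, so it is a local quantity and need not match the exact probability. In a learning setting, `loss` would be differentiated with respect to the model's parameters; the sketch only evaluates it.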