Despite their proficiency in various language tasks, Large Language Models (LLMs) struggle with combinatorial problems such as Satisfiability, the Traveling Salesman Problem, and even basic arithmetic. We address this gap through a novel trial-and-error approach for solving problems in the class NP, in which candidate solutions are iteratively generated and efficiently validated by verifiers. We focus on the paradigmatic task of Sudoku and achieve state-of-the-art accuracy (99%) compared to prior neuro-symbolic approaches. Unlike prior work that relied on custom architectures, our method employs a vanilla decoder-only Transformer (GPT-2) without external tools or function calling. Our method integrates imitation learning of simple Sudoku rules with an explicit Depth-First Search (DFS) exploration strategy involving informed guessing and backtracking. Moving beyond imitation learning, we seek to minimize the number of guesses needed to reach a solution. We achieve this with depth-1 guessing, showing empirically that almost all Sudoku puzzles can be solved using the puzzle's rules and at most one guess. We provide a rigorous analysis of this setup, formalizing its connection to a contextual variant of Min-Sum Set Cover, a well-studied problem in algorithms and stochastic optimization.