Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce \texttt{SEM-CTRL}, a unified approach for enforcing rich, context-sensitive constraints and task- and instance-specific semantics directly on the LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS) guided by syntactic and semantic constraints. Constraints over desired outputs are expressed using Answer Set Grammars (ASGs), a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach helps guarantee valid completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate \texttt{SEM-CTRL} on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, JSON parsing, and planning. Our experimental results demonstrate that \texttt{SEM-CTRL} enables even small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., \textit{o4-mini}) while simultaneously guaranteeing semantic validity.
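The core decoding idea can be illustrated with a minimal, hypothetical sketch: at each decoding step, a grammar-prefix check masks out tokens that cannot lead to a valid completion. Everything below is a stand-in for exposition only: a toy balanced-bracket checker plays the role of the Answer Set Grammar, a fixed scoring function plays the role of the LLM's next-token log-probabilities, and a greedy loop stands in for the token-level MCTS described in the paper.

```python
# Hypothetical sketch of grammar-constrained token-level decoding.
# SEM-CTRL uses Answer Set Grammars and MCTS; here a trivial
# balanced-bracket prefix check and a greedy loop stand in for both.

VOCAB = ["(", ")", "<eos>"]

def is_valid_prefix(tokens):
    """Stand-in for a grammar prefix check: brackets never close below
    zero, and <eos> is only legal once all brackets are closed."""
    depth = 0
    for i, t in enumerate(tokens):
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:
                return False
        elif t == "<eos>":
            return depth == 0 and i == len(tokens) - 1
    return True

def toy_lm_scores(prefix):
    """Stand-in for LLM next-token log-probs (fixed, illustrative)."""
    if len(prefix) < 2:
        return {"(": -0.1, ")": -2.0, "<eos>": -3.0}
    return {"(": -2.0, ")": -0.5, "<eos>": -1.0}

def constrained_greedy(max_len=8):
    """Greedy decoding with token masking: only tokens whose extension
    keeps the prefix grammatically valid are considered. SEM-CTRL
    replaces this greedy choice with MCTS over the masked space."""
    tokens = []
    for _ in range(max_len):
        scores = toy_lm_scores(tokens)
        legal = [t for t in VOCAB if is_valid_prefix(tokens + [t])]
        if not legal:
            break
        tokens.append(max(legal, key=lambda t: scores[t]))
        if tokens[-1] == "<eos>":
            break
    return tokens

print(constrained_greedy())  # a grammatically valid sequence
```

Note how the unconstrained model would happily emit `")"` first (or stop mid-expression), whereas masking guarantees every emitted prefix remains extensible to a valid string, which is the property that makes the approach model-agnostic.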