Natural generation allows Large Language Models (LLMs) to produce free-form responses with rich reasoning, yet the lack of structure makes outputs difficult to verify. Conversely, constrained decoding ensures standardized formats but can inadvertently restrict reasoning capabilities by imposing constraints too early in the generation process. We propose a hybrid approach, called In-Writing, that combines free-form reasoning and structured generation in a single call. The model first performs unconstrained reasoning and applies structured decoding only after a trigger token is generated, explicitly decoupling reasoning from formatting. We show that our trigger-token strategies virtually eliminate premature triggering, a failure mode in which constrained decoding interrupts ongoing reasoning. Evaluations across diverse datasets covering classification and reasoning tasks demonstrate that our approach outperforms the state of the art, with accuracy gains of up to 27% over natural generation. Our code is available at: https://github.com/Nokia-Bell-Labs/InWriting.
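The trigger-gated idea described above can be illustrated with a toy decoding loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the trigger string `<answer>`, the function `trigger_gated_decode`, the scripted `toy_scores` model, and the one-token label "grammar" are all hypothetical names chosen for illustration. Decoding is greedy and unconstrained until the trigger appears, after which candidate tokens are restricted to an allowed set.

```python
from typing import Callable, Dict, List, Sequence

TRIGGER = "<answer>"  # hypothetical trigger token, not taken from the paper


def trigger_gated_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    allowed_after_trigger: Sequence[str],
    max_steps: int = 32,
    eos: str = "<eos>",
) -> List[str]:
    """Greedy decoding that is free-form until TRIGGER, then constrained."""
    out: List[str] = []
    constrained = False
    for _ in range(max_steps):
        scores = score_fn(out)
        if constrained:
            # Structured phase: keep only tokens the (toy) grammar allows.
            scores = {t: s for t, s in scores.items() if t in allowed_after_trigger}
            if not scores:
                raise ValueError("no allowed token has a score")
        token = max(scores, key=scores.get)
        out.append(token)
        if constrained or token == eos:
            break  # this toy grammar is a single label, so stop after it
        if token == TRIGGER:
            constrained = True  # switch modes only after the trigger is emitted
    return out


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Scripted stand-in for a model: reasons first, then emits the trigger."""
    script = ["2", "+", "2", "is", "4", TRIGGER]
    if len(prefix) < len(script):
        return {script[len(prefix)]: 1.0, "yes": 0.1, "no": 0.05}
    # After the trigger, the unconstrained favorite ("maybe") is not a valid label.
    return {"maybe": 0.9, "yes": 0.6, "no": 0.2}


result = trigger_gated_decode(toy_scores, allowed_after_trigger=["yes", "no"])
print(result)
```

In this toy run the reasoning tokens are chosen without any mask, and the mask only takes effect on the step after `<answer>`, so the invalid but higher-scoring `maybe` is filtered out in favor of `yes`. A real system would apply the mask to model logits and use a full grammar or schema rather than a single-label set.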