Effective code generation with language models hinges on two critical factors: accurately understanding the intent of the prompt, and generating code that applies algorithmic reasoning to produce correct solutions that pass diverse test cases while adhering to the syntax of the target programming language. Unlike other language tasks, code generation requires more than accurate token prediction; it demands comprehension of solution-level and structural relationships rather than merely generating the most likely tokens. Very large language models (VLLMs) can generate detailed steps toward the correct solution of complex tasks in which reasoning is crucial, a capability that smaller language models often lack. In this work, we therefore distill the reasoning capabilities of a VLLM into a smaller, more efficient model that is faster and cheaper to deploy. Our approach trains the model to emulate the reasoning and problem-solving abilities of the VLLM by learning to identify correct solution pathways and by establishing a structural correspondence between problem definitions and potential solutions through a novel structure-aware loss optimization method. This enables the model to move beyond token-level generation and grasp the overarching structure of solutions to given problems. Experimental results show that our fine-tuned model, produced by an inexpensive and simple-to-implement process, significantly outperforms our baseline model in pass@1, average data flow, and average syntax match across the MBPP, MBPP Plus, and HumanEval benchmarks.
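To make the idea of structure-aware loss optimization concrete, the following is a minimal illustrative sketch, not the paper's actual method: it augments an ordinary token-level loss with a penalty measuring how far the generated code's syntactic structure (here approximated by AST node-type counts) diverges from the reference solution. The function names (`structure_distance`, `structure_aware_loss`) and the weighting scheme are hypothetical assumptions for illustration only.

```python
import ast
from collections import Counter


def ast_node_counts(code: str) -> Counter:
    """Count AST node types in a code snippet (a coarse syntax-level signature)."""
    tree = ast.parse(code)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def structure_distance(generated: str, reference: str) -> float:
    """Normalized multiset difference between AST node-type counts.

    0.0 means identical node-type structure; 1.0 means fully disjoint.
    """
    gen_counts = ast_node_counts(generated)
    ref_counts = ast_node_counts(reference)
    overlap = sum((gen_counts & ref_counts).values())
    total = sum((gen_counts | ref_counts).values())
    return 1.0 - overlap / total if total else 0.0


def structure_aware_loss(token_loss: float, generated: str, reference: str,
                         lam: float = 0.5) -> float:
    """Token-level loss augmented with a weighted structural penalty (hypothetical)."""
    return token_loss + lam * structure_distance(generated, reference)
```

Under this sketch, two snippets with identical syntactic shape (e.g. `x = 1` vs. `y = 2`) incur no structural penalty, while structurally different code (e.g. an assignment vs. a loop) does, so the training signal rewards matching the solution's structure rather than its exact tokens.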