Recent advances in large language models (LLMs) for code applications have demonstrated remarkable zero-shot fluency and instruction following on challenging code-related tasks ranging from test case generation to self-repair. Unsurprisingly, however, models struggle to compose syntactically valid programs in programming languages unrepresented in pre-training, referred to as very low-resource Programming Languages (VLPLs). VLPLs appear in crucial settings, including domain-specific languages for internal tools and tool-chains, and legacy languages. Inspired by an HCI technique called natural program elicitation, we propose designing an intermediate language that LLMs ``naturally'' know how to use and which can be automatically compiled to the target VLPL. Specifically, we introduce synthetic programming elicitation and compilation (SPEAK), an approach that enables LLMs to generate syntactically valid code even for VLPLs. We empirically evaluate the performance of SPEAK in a case study and find that, compared to existing retrieval and fine-tuning baselines, SPEAK produces syntactically correct programs more frequently without sacrificing semantic correctness.