Recent improvements in code generation driven by large language models have mainly benefited general-purpose programming languages. Domain-specific languages, such as those used for IT automation, have received far less attention, despite involving many active developers and being an essential component of modern cloud platforms. This work focuses on the generation of Ansible-YAML, a widely used markup language for IT automation. We present Ansible Wisdom, a natural-language to Ansible-YAML code generation tool aimed at improving IT automation productivity. Ansible Wisdom is a transformer-based model, extended by training on a new dataset containing Ansible-YAML. We also develop two novel performance metrics for YAML and Ansible to capture the specific characteristics of this domain. Results show that Ansible Wisdom can accurately generate Ansible scripts from natural-language prompts, with performance comparable to or better than existing state-of-the-art code generation models. In few-shot settings, we assess the impact of training with Ansible and YAML data and compare against different baselines, including Codex-Davinci-002. We also show that, after fine-tuning, our Ansible-specific model (BLEU: 66.67) can outperform a much larger Codex-Davinci-002 model (BLEU: 50.4), which was evaluated in few-shot settings.