Automated Code generation for Information Technology Tasks in YAML through Large Language Models

The recent improvement in code generation capabilities due to the use of large language models has mainly benefited general-purpose programming languages. Domain-specific languages, such as those used for IT automation, have received far less attention, despite involving many active developers and being an essential component of modern cloud platforms. This work focuses on the generation of Ansible-YAML, a widely used markup language for IT automation. We present Ansible Wisdom, a natural-language to Ansible-YAML code generation tool aimed at improving IT automation productivity. Ansible Wisdom is a transformer-based model, extended by training with a new dataset containing Ansible-YAML. We also develop two novel performance metrics for YAML and Ansible to capture the specific characteristics of this domain. Results show that Ansible Wisdom can accurately generate Ansible scripts from natural language prompts, with performance comparable to or better than existing state-of-the-art code generation models. In few-shot settings, we assess the impact of training with Ansible and YAML data and compare with different baselines, including Codex-Davinci-002. We also show that, after fine-tuning, our Ansible-specific model (BLEU: 66.67) can outperform a much larger Codex-Davinci-002 model (BLEU: 50.4), which was evaluated in few-shot settings.
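The BLEU figures quoted above measure n-gram overlap between a generated script and a reference script. As a rough illustration only, the following is a minimal sketch of an unsmoothed, sentence-level BLEU on whitespace tokens, scaled to 0-100. The function names `ngrams` and `sentence_bleu`, the whitespace tokenization, and the Ansible snippets in the usage lines are assumptions made for this sketch; the abstract does not state which BLEU implementation, tokenizer, or smoothing the authors used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed sentence-level BLEU on whitespace tokens, scaled to 0-100.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, one empty precision zeroes the whole score.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


# Illustrative reference and model output, flattened to one line each.
reference = "ansible.builtin.package: name: nginx state: present"
generated = "ansible.builtin.package: name: nginx state: latest"
score = sentence_bleu(reference, generated)
```

Because BLEU only counts surface overlap, a script that differs in a single value can still score well while behaving differently, which is one plausible motivation for the YAML- and Ansible-specific metrics the abstract mentions.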
