The availability of Large Language Models (LLMs) that can generate code has made it possible to create tools that improve developer productivity. Integrated development environments (IDEs), which developers use to write software, often serve as the interface for interacting with LLMs. Although many such tools have been released, almost all of them focus on general-purpose programming languages. Domain-specific languages, such as those crucial for Information Technology (IT) automation, have received little attention. Ansible is one such language: a YAML-based language for IT automation. Ansible Lightspeed is an LLM-based service designed explicitly to generate Ansible YAML from a natural language prompt. This paper first presents the design and implementation of the Ansible Lightspeed service. We then evaluate its utility to developers using diverse indicators, including extended utilization, analysis of user-rejected suggestions, and analysis of user sentiment. The analysis is based on data collected from 10,696 real users, including 3,910 returning users. The code for the Ansible Lightspeed service and the analysis framework is made available for others to use. To our knowledge, our study is the first to involve thousands of users in evaluating code assistants for domain-specific languages. We propose an improved version of the user acceptance rate, and we are the first to present N-Day user retention figures for a code completion tool. Our findings provide insights into the effectiveness of small, dedicated models in a domain-specific context. We hope this work serves as a reference for software engineering and machine learning researchers exploring code completion services for domain-specific languages in particular and programming languages in general.