This paper presents a study on the feasibility of using large language models (LLMs) for coding with low-resource and domain-specific programming languages, which typically lack the amount of data that LLM techniques need to be effective. The study focuses on hansl, the econometric scripting language of the open-source software gretl, and employs a proprietary LLM based on GPT-3.5. Our findings suggest that LLMs can be a useful tool for writing, understanding, improving, and documenting gretl code, including generating descriptive docstrings for functions and providing precise explanations for abstract and poorly documented econometric code. While the LLM showed promising docstring-to-code translation capability, we also identify some limitations, such as its inability to improve certain sections of code and to write accurate unit tests. This study is a step towards leveraging the power of LLMs to facilitate software development in low-resource programming languages and, ultimately, to lower the barriers to their adoption.