In this paper, we examine the advancement of domain-specific Large Language Models (LLMs), focusing on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning to assist developers in processing software-related natural language queries. As an instruction-tuned LLM, the model is particularly adept at handling intricate technical documentation, enhancing developer capability in software-specific tasks. Creating DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior capabilities in these tasks in comparison with other models, including ChatGPT. This research not only highlights the potential of specialized LLMs in software development but also introduces a pioneering LLM for this domain.