In this paper, we examine the advancement of domain-specific Large Language Models (LLMs), focusing on their application in software development. We introduce DevAssistLlama, an instruction-tuned LLM that assists developers in processing software-related natural language queries. The model is particularly adept at handling intricate technical documentation, enhancing developer capability in software-specific tasks. To create DevAssistLlama, we constructed an extensive instruction dataset from various software systems, enabling the model to handle Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP) effectively. Our results demonstrate DevAssistLlama's superior capabilities in these tasks compared with other models, including ChatGPT. This research not only highlights the potential of specialized LLMs in software development but also presents a pioneering LLM for this domain.