Large language models have demonstrated impressive capabilities across many domains, but their general-purpose nature often limits their effectiveness in specialized fields such as energy, where deep technical expertise and precise domain knowledge are essential. In this paper, we introduce EnergyGPT, a domain-specialized language model for the energy sector, developed by fine-tuning LLaMA 3.1-8B on a high-quality, curated corpus of energy-related texts. We consider two adaptation strategies: a full-parameter supervised fine-tuning (SFT) variant and a parameter-efficient LoRA-based variant that updates only a small fraction of the model's parameters. We present the complete development pipeline, including data collection and curation, model fine-tuning, benchmark design, LLM-as-a-judge selection, evaluation, and deployment. Through this work, we demonstrate that our training strategy improves domain relevance and performance without requiring large-scale infrastructure. Evaluated on domain-specific question-answering benchmarks, both EnergyGPT variants outperform the base model on most energy-related language understanding and generation tasks, with the LoRA variant achieving competitive gains at a significantly lower training cost.
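To illustrate why the LoRA variant trains at a fraction of the cost of full-parameter fine-tuning, the sketch below shows the low-rank update idea numerically. The layer size (1024×1024), rank (16), and scaling factor are illustrative assumptions for this sketch, not the hyperparameters actually used for EnergyGPT.

```python
import numpy as np

# Minimal sketch of the LoRA idea: freeze the pretrained weight matrix W
# and learn only a low-rank update B @ A. Sizes here are illustrative
# assumptions, not the paper's actual configuration.
rng = np.random.default_rng(0)

d_out, d_in, r = 1024, 1024, 16                  # hypothetical layer size and rank
W = rng.standard_normal((d_out, d_in)) * 0.02    # frozen pretrained weights
A = rng.standard_normal((r, d_in)) * 0.02        # trainable down-projection
B = np.zeros((d_out, r))                         # trainable up-projection, zero-init

alpha = 32                                       # common LoRA scaling hyperparameter
scale = alpha / r

def lora_forward(x):
    # Equivalent to x @ (W + scale * B @ A).T, but the merged matrix is
    # never materialized; only A and B would receive gradients in training.
    return x @ W.T + scale * (x @ A.T) @ B.T

full_params = W.size                  # what full SFT would update
lora_params = A.size + B.size         # what LoRA updates
print(f"trainable fraction: {lora_params / full_params:.4%}")
# prints: trainable fraction: 3.1250%
```

Because B is initialized to zero, the adapted layer starts out identical to the frozen base layer, and training only has to store optimizer state for A and B rather than for all of W — the source of the reduced training cost the abstract reports.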