In this paper, we present the first investigation into the effectiveness of Large Language Models (LLMs) for Failure Mode Classification (FMC). FMC, the task of automatically labelling an observation with a corresponding failure mode code, is critical in the maintenance domain because it reduces the time reliability engineers spend manually analysing work orders. We detail our prompt engineering approach, which enables an LLM to predict the failure mode of a given observation from a restricted code list. We demonstrate that a GPT-3.5 model fine-tuned on annotated data (F1=0.80) significantly outperforms a currently available text classification model (F1=0.60) trained on the same annotated data set. The fine-tuned model also outperforms the out-of-the-box GPT-3.5 (F1=0.46). This investigation reinforces the need for high-quality fine-tuning data sets for domain-specific tasks using LLMs.
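As a rough illustration of what prompting with a restricted code list can look like, the following minimal sketch builds a classification prompt and maps a model's free-text reply back onto the allowed labels. It is not the prompt used in this work: the label list `FAILURE_MODES`, the prompt wording, and the helper names `build_prompt` and `parse_prediction` are all assumptions made for illustration, and no model is actually called.

```python
# Hypothetical label set for illustration only; the actual restricted
# code list used in the paper is not given in this abstract.
FAILURE_MODES = ["Breakdown", "Leaking", "Overheating", "Vibration", "Other"]


def build_prompt(observation: str, codes: list[str] = FAILURE_MODES) -> str:
    """Build a prompt that asks for exactly one label from the allowed list."""
    code_list = "\n".join(f"- {code}" for code in codes)
    return (
        "Classify the maintenance observation into exactly one failure mode.\n"
        "Answer with one label from this list and nothing else:\n"
        f"{code_list}\n\n"
        f"Observation: {observation}\n"
        "Failure mode:"
    )


def parse_prediction(raw: str, codes: list[str] = FAILURE_MODES,
                     fallback: str = "Other") -> str:
    """Map a raw model reply onto the restricted list, else use the fallback."""
    cleaned = raw.strip().strip(".").lower()
    for code in codes:
        if cleaned == code.lower():
            return code
    # Anything outside the list is treated as out-of-vocabulary.
    return fallback


if __name__ == "__main__":
    print(build_prompt("pump seal leak"))
    print(parse_prediction(" leaking.\n"))
```

Checking the reply against the list matters because a model that has not been fine-tuned may answer with a label outside the restricted codes; here such replies fall back to a catch-all label, which is one design choice among several.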