This paper introduces TRACE-GPT (Time-seRies Anomaly-detection with Convolutional Embedding and Generative Pre-trained Transformers), a model that is pre-trained on univariate time-series sensor data and detects faults in unlabeled datasets from semiconductor manufacturing. In the semiconductor industry, distinguishing abnormal time-series sensor data from normal data is important because it is directly related to wafer defects. However, training data that are small, unlabeled, and even mixed, with too few anomalies, make this classification task difficult. In this research, we capture features of time-series data with a temporal convolutional embedding and a Generative Pre-trained Transformer (GPT), and we separate abnormal sequences from normal sequences using cross-entropy loss. We show experimentally that our model outperforms previous unsupervised models on both an open dataset, the University of California, Riverside (UCR) time-series classification archive, and the process logs of our Chemical Vapor Deposition (CVD) equipment. Among the unsupervised models, ours achieves the highest F1 score at the Equal Error Rate (EER) across all datasets, and on the open dataset its F1 score is only 0.026 below that of the supervised state-of-the-art baseline.
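The abstract reports the F1 score at the Equal Error Rate but does not spell out how that operating point is chosen. The sketch below shows one common reading, assuming each sequence has a scalar anomaly score (for example, one derived from the model's cross-entropy loss, which is an assumption here) and a binary label used only for evaluation. The function name `f1_at_eer` and the threshold sweep are illustrative and are not taken from the paper.

```python
import numpy as np


def f1_at_eer(scores, labels):
    """Return (f1, threshold) at the threshold where FPR and FNR are closest.

    scores: higher means more anomalous (assumed convention).
    labels: 1 for abnormal sequences, 0 for normal sequences.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one normal and one abnormal sequence")

    best_gap, best_f1, best_thr = np.inf, 0.0, None
    # Sweep every distinct score as a candidate threshold.
    for thr in np.unique(scores):
        pred = scores >= thr
        tp = int(np.sum(pred & (labels == 1)))
        fp = int(np.sum(pred & (labels == 0)))
        fn = n_pos - tp
        # EER is the point where false positive and false negative rates match.
        gap = abs(fp / n_neg - fn / n_pos)
        if gap < best_gap:
            f1 = 2 * tp / (2 * tp + fp + fn)
            best_gap, best_f1, best_thr = gap, f1, float(thr)
    return best_f1, best_thr


if __name__ == "__main__":
    f1, thr = f1_at_eer([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1])
    print(f"F1 at EER = {f1:.3f} (threshold {thr})")
```

With finitely many sequences the two error rates rarely match exactly, so this sketch takes the threshold that minimizes their gap; interpolating between neighbouring thresholds is another reasonable choice.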